Article

Application of Gradient Optimization Methods in Defining Neural Dynamics

by Predrag S. Stanimirović 1,2, Nataša Tešić 3, Dimitrios Gerontitis 4, Gradimir V. Milovanović 5, Milena J. Petrović 6,*, Vladimir L. Kazakovtsev 2 and Vladislav Stasiuk 2

1 Faculty of Sciences and Mathematics, University of Niš, 18000 Niš, Serbia
2 Laboratory “Hybrid Methods of Modelling and Optimization in Complex Systems”, Siberian Federal University, Prosp. Svobodny 79, 660041 Krasnoyarsk, Russia
3 Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
4 Department of Information and Electronic Engineering, International Hellenic University, 57400 Thessaloniki, Greece
5 Mathematical Institute, Serbian Academy of Sciences and Arts, Kneza Mihaila 35, 11000 Belgrade, Serbia
6 Faculty of Sciences and Mathematics, University of Pristina in Kosovska Mitrovica, Lole Ribara 29, 38220 Kosovska Mitrovica, Serbia
* Author to whom correspondence should be addressed.
Axioms 2024, 13(1), 49; https://doi.org/10.3390/axioms13010049
Submission received: 1 November 2023 / Revised: 24 December 2023 / Accepted: 11 January 2024 / Published: 14 January 2024
(This article belongs to the Special Issue Numerical Analysis and Optimization)

Abstract

Applications of gradient methods for nonlinear optimization in the development of Gradient Neural Network (GNN) and Zhang Neural Network (ZNN) models are investigated. In particular, the solution of the matrix equation $AXB = D$, which changes over time, is studied using a novel GNN model, termed GGNN$(A,B,D)$. The GGNN model is developed by applying GNN dynamics to the gradient of the error matrix used in the development of the GNN model. The convergence analysis shows that the neural state matrix of the GGNN$(A,B,D)$ design converges asymptotically to the solution of the matrix equation $AXB = D$ for any initial state matrix. It is also shown that the limit is a least-squares solution determined by the selected initial matrix. A hybridization of GGNN with the analogous modification GZNN of the ZNN dynamics is considered. The Simulink implementation of the presented GGNN models is carried out on a set of real matrices.

1. Introduction and Background

Recurrent neural networks (RNNs) are an important class of algorithms for computing matrix (generalized) inverses. These algorithms are used to find the solutions of matrix equations or to minimize certain nonlinear matrix functions. RNNs are divided into two subgroups: Gradient Neural Networks (GNNs) and Zhang Neural Networks (ZNNs). The GNN design is explicit and mostly applicable to time-invariant problems, which means that the coefficients of the equations that are addressed are constant matrices. ZNN models can be implicit and are able to solve time-varying problems, where the coefficients of the equations depend on the variable $t \in \mathbb{R}$, $t > 0$, representing time [1,2,3].
The Moore–Penrose inverse of $A \in \mathbb{R}^{p \times n}$ is the unique matrix $A^{\dagger} = X \in \mathbb{R}^{n \times p}$ that solves the well-known Penrose equations [4,5]:
$$A = AXA, \quad X = XAX, \quad AX = (AX)^{T}, \quad XA = (XA)^{T},$$
where $(\cdot)^{T}$ denotes the transpose. The rank of a matrix $A$, i.e., the maximum number of linearly independent columns of $A$, is denoted by $\operatorname{rank}(A)$.
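The four Penrose equations can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the test matrix is hypothetical (it is not taken from the paper), and the cutoff `rcond=1e-10` is an illustrative choice for treating tiny singular values as zero.

```python
import numpy as np

# Hypothetical rank-deficient test matrix: row 1 = row 3 + 2 * row 4, row 2 = 2 * row 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# Moore-Penrose inverse computed from the SVD; singular values below
# rcond * (largest singular value) are treated as zero.
X = np.linalg.pinv(A, rcond=1e-10)

# The four Penrose equations, checked up to floating-point tolerance.
penrose = {
    "A X A = A": np.allclose(A @ X @ A, A),
    "X A X = X": np.allclose(X @ A @ X, X),
    "(A X)^T = A X": np.allclose((A @ X).T, A @ X),
    "(X A)^T = X A": np.allclose((X @ A).T, X @ A),
}
rank_A = np.linalg.matrix_rank(A)

for name, holds in penrose.items():
    print(f"{name}: {holds}")
print("rank(A) =", rank_A)
```

The example uses a rank-deficient matrix on purpose, since this is the case in which the ordinary inverse does not exist and the Penrose equations are the defining property.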
Applications of linear algebra tools and generalized inverses can be found in important areas such as the modeling of electrical circuits [6], the estimation of DNA sequences [7] and the balancing of chemical equations [8,9], as well as in other important research domains related to robotics [10] and statistics [11]. A number of iterative methods for solving matrix equations based on gradient values have been proposed [12,13,14,15].
In the following sections, we will focus on GNN and ZNN dynamical systems based on the gradient of the objective function and their implementation. The main goal of this research is the analysis of convergence and the study of analytic solutions.
Models with GNN neural designs for computing the inverse or the Moore–Penrose inverse and linear matrix equations were proposed in [16,17,18,19]. Further, various dynamical systems aimed at approximating the pseudo-inverse of rank-deficient matrices were developed in [16]. Wei, in [20], proposed three RNN models for the approximation of the weighted Moore–Penrose inverse. Online matrix inversion in a complex matrix case was considered in [21]. A novel GNN design based on nonlinear activation functions (AFs) was proposed and analyzed in [22,23] for solving the constant Lyapunov matrix equation online. A fast convergent GNN aimed at solving a system of linear equations was proposed and numerically analyzed in [24]. Xiao, in [25], investigated the finite-time convergence of an appropriately accelerated ZNN for the online solution of the time-varying complex matrix equation $A(t)X(t) = B(t)$. A comparison with the corresponding GNN design was considered. Two improved nonlinear GNN dynamical systems for approximating the Moore–Penrose inverse of full-row or full-column rank matrices were proposed and considered in [26]. GNN-type models for solving matrix equations and computing related generalized inverses were developed in [1,3,13,16,18,20,27,28,29]. The acceleration of GNN dynamics to a finite-time convergence has been investigated recently. A finite-time convergent GNN for approximating online solutions of the general linear matrix equation $AX(t)B + CX(t)D = B$ was proposed in [30]. This goal was achieved using two AFs in the construction of the GNN. The influence of AFs on the convergence performance of a GNN design for solving the matrix equation $AXB + X = C$ was investigated in [31]. A fixed-time convergent GNN for solving the Sylvester equation was investigated in [32]. Moreover, noise-tolerant GNN models equipped with a suitable AF and able to solve convex optimization problems were developed in [33].
Our goal is to solve the equation $AXB = D$ and apply its particular cases in computing generalized inverses in real time by improving the GNN model developed in [34], which is denoted by GNN$(A,B,D)$. Our motivation is to improve the GNN$(A,B,D)$ model and develop a novel gradient-based GGNN model, termed GGNN$(A,B,D)$, utilizing a novel type of dynamical system. The proposed GGNN model is based on the standard GNN dynamics along the gradient of the standard error matrix. The convergence analysis reveals the global asymptotic convergence of GGNN$(A,B,D)$ without restrictions, while the output belongs to the set of general solutions to the matrix equation $AXB = D$.
In addition, we propose gradient-based modifications of the hybrid models developed in [35] as proper combinations of GNN and ZNN models for solving the matrix equations $BX = D$ and $XC = D$ with constant coefficients. Analogous hybridizations for approximating the matrix inverse were developed in [36], while two modifications of the ZNN design for computing the Moore–Penrose inverse were proposed in [37]. Hybrid continuous-gradient–Zhang neural dynamics for solving linear time-variant equations were investigated in [38,39]. The hybrid GNN-ZNN models developed in this paper are aimed at solving the matrix equations $AX = B$ and $XC = D$ and are denoted by HGZNN$(A,I,B)$ and HGZNN$(I,C,D)$, respectively.
The implementation was performed in MATLAB Simulink, and numerical experiments were performed with simulations of the GNN, GGNN and HGZNN models.
The GNN used to solve the general linear matrix equation $AXB = D$ is defined over the error matrix $E(t) = D - AV(t)B$, where $t \in [0, +\infty)$ is time, and $V(t)$ is an unknown state-variable matrix that approximates the unknown matrix $X$ in $AXB = D$. The goal function is $\varepsilon(t) = \|D - AV(t)B\|_F^2/2$, where $\|A\|_F = \sqrt{\sum_{ij} a_{ij}^2}$ denotes the Frobenius norm of a matrix. The gradient of $\varepsilon(t)$ is equal to
$$\frac{\partial \varepsilon(t)}{\partial V} = \nabla\varepsilon = \frac{1}{2}\,\frac{\partial \|D - AV(t)B\|_F^2}{\partial V} = -A^{T}\left(D - AV(t)B\right)B^{T}.$$
The GNN evolutionary design is defined by the dynamic system
$$\dot V(t) = \frac{dV(t)}{dt} = -\gamma\,\frac{\partial \varepsilon(t)}{\partial V}, \quad V(0) = V_0,$$
where $\gamma > 0$ is a real parameter used to speed up the convergence, and $\dot V(t)$ denotes the time derivative of $V(t)$. Thus, the linear GNN aimed at solving $AXB = D$ is given by the following dynamics:
$$\dot V(t) = \gamma A^{T}\left(D - AV(t)B\right)B^{T}. \tag{2}$$
The dynamical flow (2) is denoted as GNN$(A,B,D)$. The nonlinear GNN$(A,B,D)$ for solving $AXB = D$ is defined by
$$\dot V(t) = \gamma A^{T}\,\mathcal{F}\left(D - AV(t)B\right)B^{T}. \tag{3}$$
The function array $\mathcal{F}(C) = \mathcal{F}([c_{ij}])$ is based on an appropriate odd and monotonically increasing activation function $f$, which is applied to the elements of a real matrix $C = (c_{ij}) \in \mathbb{R}^{m \times n}$, i.e., $\mathcal{F}(C) = [f(c_{ij})]$, $i = 1, \ldots, m$, $j = 1, \ldots, n$.
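The flow (2)–(3) can be approximated numerically. The following is a minimal sketch that replaces the Simulink/ode15s setup of the paper with plain explicit Euler steps; the function name `gnn_flow`, the step size, the number of steps and the test matrices are all illustrative assumptions.

```python
import numpy as np

def gnn_flow(A, B, D, V0, gamma=1.0, dt=0.05, steps=1000, f=lambda x: x):
    """Explicit Euler steps for dV/dt = gamma * A^T F(D - A V B) B^T.

    `f` is applied elementwise; the default (identity) gives the linear flow (2).
    """
    V = np.array(V0, dtype=float)
    for _ in range(steps):
        E = D - A @ V @ B                      # error matrix E(t)
        V = V + dt * gamma * (A.T @ f(E) @ B.T)
    return V

# Hypothetical well-conditioned data: A has full column rank, B has full row rank,
# so A X B = D has the unique solution X_true.
A = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
X_true = np.array([[1.0, -2.0], [3.0, 0.5]])
D = A @ X_true @ B

V = gnn_flow(A, B, D, V0=np.zeros((2, 2)))
error_norm = np.linalg.norm(D - A @ V @ B)
print("||E||_F at the final step:", error_norm)
```

Explicit Euler is only stable when the step is small relative to $\gamma\,\lambda_{\max}(A^TA)\,\lambda_{\max}(BB^T)$; for large gains such as those used in the experiments, a stiff solver is the more natural choice.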
Proposition 1 restates restrictions on the solvability of $AXB = D$ and its general solution.
Proposition 1
([4,5]). If $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{p \times q}$ and $D \in \mathbb{R}^{m \times q}$, then the fulfillment of the condition
$$AA^{\dagger} D B^{\dagger} B = D \tag{4}$$
is necessary and sufficient for the solvability of the linear matrix equation $AXB = D$. In this case, the set of all solutions is given by
$$X \in \left\{ A^{\dagger} D B^{\dagger} + Y - A^{\dagger} A\, Y B B^{\dagger} \;\middle|\; Y \in \mathbb{R}^{n \times p} \right\}. \tag{5}$$
The following results from [34] describe the conditions of convergence and the limit of the unknown matrix $V(t)$ from (3) as $t \to +\infty$.
Proposition 2
([34]). Suppose the matrices $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{p \times q}$ and $D \in \mathbb{R}^{m \times q}$ satisfy (4). Then, the unknown matrix $V(t)$ from (3) converges as $t \to +\infty$ with the equilibrium state
$$V(t) \to \tilde V = A^{\dagger} D B^{\dagger} + V(0) - A^{\dagger} A\, V(0)\, B B^{\dagger}$$
for any initial state-variable matrix $V(0) \in \mathbb{R}^{n \times p}$.
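Propositions 1 and 2 can be illustrated on a small rank-deficient example. The sketch below is an assumption-laden illustration: the matrices are hypothetical, the linear flow (2) is integrated with explicit Euler steps and $\gamma = 1$, and the helper `pinv` with its `rcond` cutoff is an illustrative choice.

```python
import numpy as np

def pinv(M):
    # Moore-Penrose inverse with an explicit cutoff for numerically zero singular values.
    return np.linalg.pinv(M, rcond=1e-10)

# Hypothetical rank-deficient data (rank(A) = rank(B) = 1).
A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 2.0], [2.0, 4.0]])
D = A @ np.array([[1.0, 0.0], [2.0, -1.0]]) @ B   # consistent by construction

# Solvability condition (4): A A^+ D B^+ B = D.
solvable = np.allclose(A @ pinv(A) @ D @ pinv(B) @ B, D)

# Euler integration of the linear GNN flow (2) from a nonzero initial state.
V0 = np.array([[0.5, -1.0], [2.0, 3.0]])
V = V0.copy()
for _ in range(400):
    V = V + 0.005 * (A.T @ (D - A @ V @ B) @ B.T)

# Equilibrium state predicted by Proposition 2.
V_limit = pinv(A) @ D @ pinv(B) + V0 - pinv(A) @ A @ V0 @ B @ pinv(B)

print("condition (4) holds:", solvable)
print("distance to predicted limit:", np.linalg.norm(V - V_limit))
```

The part of $V(0)$ that the flow cannot see, $V(0) - A^{\dagger}AV(0)BB^{\dagger}$, is carried unchanged into the limit, which is why different initial states select different members of the solution set (5).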
The research in [40] investigated various ZNN models based on optimization methods. The goal of the current research is to develop a GNN model based on the gradient $E_G(t)$ of $\|E(t)\|_F^2$ instead of the original error matrix $E(t)$.
The obtained results are summarized as follows:
  • A novel error function $E_G(t)$ is proposed for the development of the GNN dynamical evolution.
  • The GNN design based on the error function $E_G(t)$ is developed and analyzed theoretically and numerically.
  • A hybridization of GNN and ZNN dynamical systems based on the error matrix $E_G$ is proposed and investigated.
The overall organization of this paper is as follows. The motivation and derivation of the GGNN and GZNN models are presented in Section 2. Section 3 is dedicated to the convergence analysis of GGNN dynamics. A numerical comparison of GNN and GGNN dynamics is given in Section 4. Neural dynamics based on the hybridization of GGNN and GZNN models for solving matrix equations are considered in Section 5. Numerical examples of hybrid models are analyzed in Section 6. Finally, the last section presents some concluding remarks and a vision of further research.

2. Motivation and Derivation of GGNN and GZNN Models

The standard GNN design (2) solves the general linear matrix equation (GLME) $AXB = D$ under constraint (4). Our goal is to remove this restriction and propose dynamic evolutions based on error functions that tend to zero without restrictions.
To this end, we define the GNN design for solving the GLME $AXB = D$ based on the error function
$$E_G(t) := A^{T}\left(D - AV(t)B\right)B^{T} = A^{T} E(t) B^{T}, \tag{7}$$
which coincides, up to sign, with the gradient $\nabla\varepsilon(t)$ of the goal function $\varepsilon(t)$. According to known results from nonlinear unconstrained optimization [41], the equilibrium points of (7) satisfy
$$E_G(t) = 0, \quad \text{i.e.,} \quad \nabla\varepsilon(t) = 0.$$
We continue the investigation from [40]. More precisely, we develop the GNN model based on the error function $E_G(t)$ instead of the error function $E(t)$. In this way, the new neural dynamics are aimed at forcing the gradient $E_G$ to zero instead of the standard error matrix $E(t)$. It is reasonable to call such an RNN model a gradient-based GNN (abbreviated GGNN).
Proposition 3 gives the conditions for the solvability of the matrix equations $E(t) = 0$ and $E_G(t) = 0$ and the general solutions to these systems.
Proposition 3
([40]). Consider the arbitrary matrices $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{k \times h}$ and $D \in \mathbb{R}^{m \times h}$. The following statements are true:
(a) The equation $E(t) = 0$ is solvable if and only if (4) is satisfied, and the general solution to $E(t) = 0$ is given by (5).
(b) The equation $E_G(t) = 0$ is always solvable, and its general solution coincides with (5).
Proof. 
(a) This part of the proof follows from known results on the solvability and general solution of the matrix equation $AXB = D$ in terms of generalized inverses [4] (p. 52, Theorem 1), applied to the matrix equation $E(t) = 0 \iff AV(t)B = D$.
(b) According to [4] (p. 52, Theorem 1), the matrix equation
$$E_G(t) = 0 \iff A^{T}A\,V\,BB^{T} = A^{T} D B^{T}$$
is consistent if and only if
$$A^{T}A\,(A^{T}A)^{\dagger} A^{T} D B^{T} (BB^{T})^{\dagger}\, BB^{T} = A^{T} D B^{T}$$
is satisfied. Indeed, applying the properties $(A^{T}A)^{\dagger}A^{T} = A^{\dagger}$, $B^{T}(BB^{T})^{\dagger} = B^{\dagger}$ and $A^{T}AA^{\dagger} = A^{T}$, $B^{\dagger}BB^{T} = B^{T}$ of the Moore–Penrose inverse [5] results in
$$A^{T}A\,(A^{T}A)^{\dagger} A^{T} D B^{T} (BB^{T})^{\dagger}\, BB^{T} = A^{T}A A^{\dagger} D B^{\dagger} B B^{T} = A^{T} D B^{T}.$$
In addition, based on [4] (p. 52, Theorem 1), the general solution $V(t)$ to $E_G(t) = 0$ is
$$V = (A^{T}A)^{\dagger} A^{T} D B^{T} (BB^{T})^{\dagger} + Y - (A^{T}A)^{\dagger} A^{T}A\, Y\, BB^{T} (BB^{T})^{\dagger} = A^{\dagger} D B^{\dagger} + Y - A^{\dagger}A\, Y\, BB^{\dagger},$$
which coincides with (5).    □
In this way, the matrix equation $E(t) = 0$ is solvable under condition (4), while the equation $E_G(t) = 0$ is always consistent. In addition, the general solutions to the equations $E(t) = 0$ and $E_G(t) = 0$ are identical [40].
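Statement (b) of Proposition 3 can be checked numerically on data for which condition (4) fails. The following is a minimal sketch with hypothetical matrices; the helper `pinv` and the arbitrary matrix `Y` are illustrative assumptions.

```python
import numpy as np

def pinv(M):
    # Moore-Penrose inverse with an explicit cutoff for numerically zero singular values.
    return np.linalg.pinv(M, rcond=1e-10)

# Hypothetical rank-deficient coefficients and a generic right-hand side.
A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 2.0], [2.0, 4.0]])
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Condition (4) fails here, so E(t) = 0 has no solution.
condition_4 = np.allclose(A @ pinv(A) @ D @ pinv(B) @ B, D)

# Any member of the family (5) still annihilates the gradient-based error E_G.
Y = np.array([[0.3, -0.7], [1.1, 0.2]])
V = pinv(A) @ D @ pinv(B) + Y - pinv(A) @ A @ Y @ B @ pinv(B)

E = D - A @ V @ B            # standard error matrix, nonzero in this example
E_G = A.T @ E @ B.T          # gradient-based error matrix, zero up to rounding

print("condition (4) holds:", condition_4)
print("||E||_F   =", np.linalg.norm(E))
print("||E_G||_F =", np.linalg.norm(E_G))
```

In this inconsistent case, the matrices in (5) are least-squares solutions of $AXB = D$: the residual $E$ does not vanish, but its projection $A^{T}EB^{T}$ does.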
The next step is to define the GGNN dynamics using the error matrix $E_G(t)$. Let us define the objective function $\varepsilon_G = \|E_G\|_F^2/2$, whose gradient is equal to
$$\frac{\partial \varepsilon_G(V(t))}{\partial V} = \frac{1}{2}\,\frac{\partial \|A^{T}(D - AV(t)B)B^{T}\|_F^2}{\partial V} = -A^{T}A\,A^{T}\left(D - AV(t)B\right)B^{T}\,BB^{T}.$$
The dynamical system for the GGNN formula is obtained by applying the GNN evolution along the gradient of $\varepsilon_G(V(t))$ based on $E_G(t)$, as follows:
$$\dot V(t) = -\gamma\,\frac{\partial \varepsilon_G}{\partial V} = \gamma A^{T}A\,A^{T}\left(D - AV(t)B\right)B^{T}\,BB^{T}.$$
The nonlinear GGNN dynamics are defined as
$$\dot V(t) = \gamma A^{T}A\,\mathcal{F}\left(A^{T}\left(D - AV(t)B\right)B^{T}\right)BB^{T}, \tag{10}$$
in which $\mathcal{F}(C) = \mathcal{F}([c_{ij}])$ denotes the elementwise application of an odd and monotonically increasing function $f(\cdot)$, as mentioned in the previous section for the GNN model (3). Model (10) is termed GGNN$(A,B,D)$. Three activation functions $f(\cdot)$ are used in numerical experiments:
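1. Linear function
$$f_{lin}(x) = x;$$
2. Power-sigmoid activation function
$$f_{ps}(x, \rho, \varrho) = \begin{cases} x^{\rho}, & \text{if } |x| \ge 1, \\[4pt] \dfrac{1 + e^{-\varrho}}{1 - e^{-\varrho}} \cdot \dfrac{1 - e^{-\varrho x}}{1 + e^{-\varrho x}}, & \text{if } |x| < 1, \end{cases} \tag{12}$$
where $\varrho > 2$, and $\rho \ge 3$ is an odd integer;
3. Smooth power-sigmoid function
$$f_{sps}(x, \rho, \varrho) = \frac{1}{2}\,x^{\rho} + \frac{1 + e^{-\varrho}}{1 - e^{-\varrho}} \cdot \frac{1 - e^{-\varrho x}}{1 + e^{-\varrho x}}, \tag{13}$$
where $\varrho > 2$, and $\rho \ge 3$ is an odd integer.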
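The three activation functions listed above can be sketched in code as follows. This is an illustrative implementation, not the `powersig`/`smoothpowersig` Simulink blocks of the paper; the sigmoid factor is written in the standard odd, increasing form $(1 - e^{-\varrho x})/(1 + e^{-\varrho x})$, and the factor $1/2$ in the smooth variant is assumed to multiply only the power term. The defaults $\rho = \varrho = 3$ follow the values used in the numerical examples.

```python
import numpy as np

def f_lin(x):
    """Linear activation."""
    return np.asarray(x, dtype=float)

def _scaled_sigmoid(x, varrho):
    # Bipolar sigmoid scaled so that its value at x = 1 equals 1.
    scale = (1.0 + np.exp(-varrho)) / (1.0 - np.exp(-varrho))
    return scale * (1.0 - np.exp(-varrho * x)) / (1.0 + np.exp(-varrho * x))

def f_ps(x, rho=3, varrho=3.0):
    """Power-sigmoid: power branch for |x| >= 1, scaled sigmoid for |x| < 1."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) >= 1.0, x**rho, _scaled_sigmoid(x, varrho))

def f_sps(x, rho=3, varrho=3.0):
    """Smooth power-sigmoid: half of the power term plus the scaled sigmoid."""
    x = np.asarray(x, dtype=float)
    return 0.5 * x**rho + _scaled_sigmoid(x, varrho)

# Sample grid used to inspect oddness and monotonicity.
xs = np.linspace(-2.0, 2.0, 401)
print("f_ps odd:", np.allclose(f_ps(-xs), -f_ps(xs)))
print("f_sps odd:", np.allclose(f_sps(-xs), -f_sps(xs)))
print("f_ps increasing:", bool(np.all(np.diff(f_ps(xs)) > 0)))
print("f_sps increasing:", bool(np.all(np.diff(f_sps(xs)) > 0)))
```

Both nonlinear functions are applied elementwise, so they can be passed directly as the array activation $\mathcal{F}$ to a NumPy-based simulation of (3) or (10).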
Figure 1 represents the Simulink implementation of the GGNN$(A,B,D)$ dynamics (10).
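As a rough numerical counterpart to the Simulink model, the dynamics (10) can be stepped with explicit Euler. The sketch below is a simplification under stated assumptions: the function name `ggnn_flow`, the step size and the test matrices are hypothetical, and a stiff ODE solver would normally replace the fixed-step loop.

```python
import numpy as np

def ggnn_flow(A, B, D, V0, gamma=1.0, dt=5e-5, steps=400, f=lambda x: x):
    """Explicit Euler steps for dV/dt = gamma * A^T A F(A^T (D - A V B) B^T) B B^T."""
    AtA, BBt = A.T @ A, B @ B.T
    V = np.array(V0, dtype=float)
    for _ in range(steps):
        E_G = A.T @ (D - A @ V @ B) @ B.T     # gradient-based error matrix E_G(t)
        V = V + dt * gamma * (AtA @ f(E_G) @ BBt)
    return V

# Hypothetical data for which the solvability condition (4) fails.
A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 2.0], [2.0, 4.0]])
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

V = ggnn_flow(A, B, D, V0=np.zeros((2, 2)))
E_G_norm = np.linalg.norm(A.T @ (D - A @ V @ B) @ B.T)
V_reference = np.linalg.pinv(A, rcond=1e-10) @ D @ np.linalg.pinv(B, rcond=1e-10)

print("||E_G||_F at the final step:", E_G_norm)
print("distance to pinv(A) D pinv(B):", np.linalg.norm(V - V_reference))
```

The step size here is tied to the largest eigenvalue of the map $V \mapsto (A^{T}A)^2 V (BB^{T})^2$; the squared factors make the GGNN flow stiffer than the GNN flow for the same $\gamma$.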
On the other hand, the GZNN model, obtained by applying the ZNN dynamics to the Zhangian matrix $E_G(t)$, is defined in [40] by the general evolutionary design
$$\dot E_G(t) = \frac{dE_G(t)}{dt} = -\gamma\,\mathcal{F}\left(E_G(t)\right). \tag{14}$$

3. Convergence Analysis of GGNN Dynamics

In this section, we analyze the convergence properties of the GGNN model given by dynamics (10).
Theorem 1.
Consider matrices $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{p \times q}$ and $D \in \mathbb{R}^{m \times q}$. If an odd and monotonically increasing array activation function $\mathcal{F}(\cdot)$ based on an elementwise function $f(\cdot)$ is used, then the activation state matrix $V(t) \in \mathbb{R}^{n \times p}$ of the GGNN$(A,B,D)$ model (10) asymptotically converges to the solution of the matrix equation $AXB = D$, i.e., $A^{T}A\,V(t)\,BB^{T} \to A^{T} D B^{T}$ as $t \to +\infty$, for an arbitrary initial state matrix $V(0)$.
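Proof. 
From statement (b) of Proposition 3, the solvability of $A^{T}A\,V\,BB^{T} = A^{T} D B^{T}$ is ensured. The substitution $V(t) = \bar V(t) + A^{\dagger} D B^{\dagger}$ transforms the dynamics (10) into
$$\begin{aligned} \frac{d\bar V(t)}{dt} = \frac{dV(t)}{dt} &= \gamma A^{T}A\,\mathcal{F}\left(A^{T}\left(D - AV(t)B\right)B^{T}\right)BB^{T} \\ &= \gamma A^{T}A\,\mathcal{F}\left(A^{T}\left(D - A\bar V(t)B - AA^{\dagger} D B^{\dagger} B\right)B^{T}\right)BB^{T} \\ &\overset{(4)}{=} \gamma A^{T}A\,\mathcal{F}\left(A^{T}\left(D - A\bar V(t)B - D\right)B^{T}\right)BB^{T} \\ &= -\gamma A^{T}A\,\mathcal{F}\left(A^{T}A\,\bar V(t)\,BB^{T}\right)BB^{T}, \end{aligned} \tag{15}$$
where the last equality uses the oddness of $\mathcal{F}$.
The Lyapunov function candidate that measures the convergence performance is defined by
$$\mathcal{L}\left(\bar V(t), t\right) = \frac{1}{2}\,\|\bar V(t)\|_F^2 = \frac{1}{2}\operatorname{Tr}\left(\bar V(t)^{T}\bar V(t)\right). \tag{16}$$
Clearly, $\mathcal{L}(\bar V(t), t) \ge 0$. According to (16), assuming (15) and using $d\operatorname{Tr}(X^{T}X) = 2\operatorname{Tr}(X^{T}dX)$, in conjunction with the basic properties of the matrix trace function, one can express the time derivative of $\mathcal{L}(\bar V(t), t)$ as follows:
$$\begin{aligned} \frac{d\mathcal{L}(\bar V(t), t)}{dt} &= \frac{1}{2}\,\frac{d\operatorname{Tr}\left(\bar V(t)^{T}\bar V(t)\right)}{dt} = \frac{1}{2}\cdot 2\cdot \operatorname{Tr}\left(\bar V(t)^{T}\,\frac{d\bar V(t)}{dt}\right) \\ &= \operatorname{Tr}\left(\bar V(t)^{T}\left(-\gamma A^{T}A\,\mathcal{F}\left(A^{T}A\,\bar V(t)\,BB^{T}\right)BB^{T}\right)\right) \\ &= -\gamma \operatorname{Tr}\left(\bar V(t)^{T} A^{T}A\,\mathcal{F}\left(A^{T}A\,\bar V(t)\,BB^{T}\right)BB^{T}\right) \\ &= -\gamma \operatorname{Tr}\left(BB^{T}\,\bar V(t)^{T} A^{T}A\,\mathcal{F}\left(A^{T}A\,\bar V(t)\,BB^{T}\right)\right) \\ &= -\gamma \operatorname{Tr}\left(\left(A^{T}A\,\bar V(t)\,BB^{T}\right)^{T}\mathcal{F}\left(A^{T}A\,\bar V(t)\,BB^{T}\right)\right). \end{aligned}$$
Since the scalar-valued function $f(\cdot)$ is odd and monotonically increasing, it follows that, for $W(t) = A^{T}A\,\bar V(t)\,BB^{T} \in \mathbb{R}^{n \times p}$,
$$\frac{d\mathcal{L}(\bar V(t), t)}{dt} = -\gamma \operatorname{Tr}\left(W^{T}\mathcal{F}(W)\right) = -\gamma \sum_{i=1}^{n}\sum_{j=1}^{p} w_{ij}\, f(w_{ij}) \;\begin{cases} < 0 & \text{if } W(t) := A^{T}A\,\bar V(t)\,BB^{T} \ne 0, \\ = 0 & \text{if } W(t) := A^{T}A\,\bar V(t)\,BB^{T} = 0, \end{cases}$$
which implies
$$\frac{d\mathcal{L}(\bar V(t), t)}{dt} \;\begin{cases} < 0 & \text{if } W(t) \ne 0, \\ = 0 & \text{if } W(t) = 0. \end{cases}$$
Observing the identity
$$W(t) = A^{T}A\,\bar V(t)\,BB^{T} = A^{T}A\left(V(t) - A^{\dagger} D B^{\dagger}\right)BB^{T} = A^{T}A\,V(t)\,BB^{T} - A^{T} D B^{T} = A^{T}\left(AV(t)B - D\right)B^{T},$$
and using the Lyapunov stability theory, $W(t) := A^{T}\left(AV(t)B - D\right)B^{T}$ globally converges to the zero matrix from an arbitrary initial value $V(0)$.    □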
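The Lyapunov argument of the proof can be observed numerically in the linear case. The following is a minimal sketch, assuming hypothetical full-rank data (so that $A^{\dagger}DB^{\dagger}$ is the only equilibrium), $\gamma = 1$, and a conservative explicit Euler step chosen from the spectral norms of $A^{T}A$ and $BB^{T}$.

```python
import numpy as np

# Hypothetical data: A has full column rank and B has full row rank.
A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
B = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
D = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [2.0, 0.0, 1.0]])
V_shift = np.linalg.pinv(A) @ D @ np.linalg.pinv(B)   # the shift A^+ D B^+ used in the proof

AtA, BBt = A.T @ A, B @ B.T
# Conservative step: half the reciprocal of the largest eigenvalue of V -> (A^T A)^2 V (B B^T)^2.
dt = 0.5 / (np.linalg.norm(AtA, 2) * np.linalg.norm(BBt, 2)) ** 2

V = np.array([[3.0, -1.0], [0.5, 2.0]])
lyapunov = []
for _ in range(20000):
    # Lyapunov candidate (16) evaluated at the shifted state.
    lyapunov.append(0.5 * np.linalg.norm(V - V_shift, "fro") ** 2)
    # One Euler step of the linear GGNN dynamics with gamma = 1.
    V = V + dt * (AtA @ (A.T @ (D - A @ V @ B) @ B.T) @ BBt)

W_norm = np.linalg.norm(A.T @ (A @ V @ B - D) @ B.T)
print("L at start and end:", lyapunov[0], lyapunov[-1])
print("||W||_F at the final step:", W_norm)
```

The recorded values of $\mathcal{L}$ are non-increasing along the discrete trajectory, mirroring the sign of $d\mathcal{L}/dt$ derived above; this holds for the discretization only because the step is small enough.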
Theorem 2.
The activation state-variable matrix $V(t)$ of the model GGNN$(A,B,D)$, defined by (10), is convergent as $t \to +\infty$, and its equilibrium state is
$$V(t) \to \tilde V(t) = A^{\dagger} D B^{\dagger} + V(0) - A^{\dagger}A\,V(0)\,BB^{\dagger}$$
for every initial state matrix $V(0) \in \mathbb{R}^{n \times p}$.
Proof. 
From (10), the matrix $V_1(t) = (A^{T}A)^{\dagger}A^{T}A\,V(t)\,BB^{T}(BB^{T})^{\dagger}$ satisfies
$$\frac{dV_1(t)}{dt} = (A^{T}A)^{\dagger}A^{T}A\,\frac{dV(t)}{dt}\,BB^{T}(BB^{T})^{\dagger} = \gamma\,(A^{T}A)^{\dagger}A^{T}A\,A^{T}A\,A^{T}\left(D - AV(t)B\right)B^{T}\,BB^{T}\,BB^{T}(BB^{T})^{\dagger}.$$
According to the basic properties of the Moore–Penrose inverse [5], it follows that
$$(BB^{T})^{T}\,BB^{T}(BB^{T})^{\dagger} = (BB^{T})^{T} = BB^{T}, \quad (A^{T}A)^{\dagger}A^{T}A\,(A^{T}A)^{T} = (A^{T}A)^{T} = A^{T}A,$$
which further implies
$$\frac{dV_1(t)}{dt} = \gamma A^{T}A\,A^{T}\left(D - AV(t)B\right)B^{T}\,BB^{T} = \frac{dV(t)}{dt}.$$
Consequently, $V_2(t) = V(t) - V_1(t)$ satisfies $\frac{dV_2(t)}{dt} = \frac{dV(t)}{dt} - \frac{dV_1(t)}{dt} = 0$, which implies
$$V_2(t) = V_2(0) = V(0) - V_1(0) = V(0) - (A^{T}A)^{\dagger}A^{T}A\,V(0)\,BB^{T}(BB^{T})^{\dagger} = V(0) - A^{\dagger}A\,V(0)\,BB^{\dagger}, \quad t \ge 0.$$
Furthermore, from Theorem 1, $A^{T}A\,V(t)\,BB^{T} \to A^{T} D B^{T}$, and $V_1(t)$ converges to
$$V_1(t) = (A^{T}A)^{\dagger}A^{T}A\,V(t)\,BB^{T}(BB^{T})^{\dagger} \to (A^{T}A)^{\dagger}A^{T} D B^{T}(BB^{T})^{\dagger} = A^{\dagger} D B^{\dagger}$$
as $t \to +\infty$. Therefore, $V(t) = V_1(t) + V_2(t)$ converges to the equilibrium state
$$\tilde V(t) = A^{\dagger} D B^{\dagger} + V_2(t) = A^{\dagger} D B^{\dagger} + V(0) - A^{\dagger}A\,V(0)\,BB^{\dagger}.$$
The proof is finished.    □
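The decomposition $V = V_1 + V_2$ used in the proof can be reproduced numerically. The sketch below uses hypothetical rank-deficient data for which condition (4) fails, the linear GGNN dynamics with $\gamma = 1$, and explicit Euler steps; the helper `pinv` and the step size are illustrative assumptions.

```python
import numpy as np

def pinv(M):
    # Moore-Penrose inverse with an explicit cutoff for numerically zero singular values.
    return np.linalg.pinv(M, rcond=1e-10)

A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]])   # rank 1
B = np.array([[1.0, 2.0], [2.0, 4.0]])               # rank 1
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # condition (4) fails
V0 = np.array([[0.5, -1.0], [2.0, 3.0]])

def split(V):
    """Return (V_1, V_2) with V_1 = (A^T A)^+ A^T A V B B^T (B B^T)^+ and V_2 = V - V_1."""
    V1 = pinv(A.T @ A) @ A.T @ A @ V @ B @ B.T @ pinv(B @ B.T)
    return V1, V - V1

V = V0.copy()
for _ in range(400):
    # One Euler step of the linear GGNN dynamics.
    V = V + 5e-5 * (A.T @ A @ A.T @ (D - A @ V @ B) @ B.T @ B @ B.T)

V1_end, V2_end = split(V)
_, V2_start = split(V0)

# Equilibrium state predicted by Theorem 2.
V_tilde = pinv(A) @ D @ pinv(B) + V0 - pinv(A) @ A @ V0 @ B @ pinv(B)

print("V_2 unchanged:", np.allclose(V2_end, V2_start))
print("V_1 -> pinv(A) D pinv(B):", np.allclose(V1_end, pinv(A) @ D @ pinv(B)))
print("distance to predicted equilibrium:", np.linalg.norm(V - V_tilde))
```

Starting from $V(0) = 0$ makes $V_2$ vanish, so the limit reduces to $A^{\dagger}DB^{\dagger}$, which is the choice made in the numerical examples.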

4. Numerical Experiments on GNN and GGNN Dynamics

The numerical examples in this section are based on the Simulink implementation of the GGNN formula in Figure 1.
The parameter $\gamma$, the initial state $V(0)$ and the parameters $\rho$ and $\varrho$ of the nonlinear activation functions (12) and (13) are entered directly into the model, while the matrices $A$, $B$ and $D$ are defined from the workspace. It is assumed that $\rho = \varrho = 3$ in all examples. The ode15s differential equation solver is used in the configuration parameters. In all examples, $V^{*}$ denotes the theoretical solution.
The blocks powersig, smoothpowersig and transpmult include the codes described in [34,42].
Example 1.
Let us consider the idempotent matrix A from [43,44],
A = 1 0 1 1 0 1 1 2 0 0 0 0 0 0 0 0
of rank ( A ) = 2 , and the theoretical Moore–Penrose inverse
V * = A = 1 3 2 1 0 0 1 1 0 0 1 0 0 0 0 1 0 0 .
The matrix equation corresponding to the Moore–Penrose inverse is $A^{T}AX = A^{T}$ [16], which implies the error function $E(t) = A^{T}(I - AX)$. The corresponding GNN model is defined by GNN$(A^{T}A, I_4, A^{T})$, where $I_4$ denotes the $4 \times 4$ identity matrix. Constraint (4) reduces to the condition $AA^{\dagger}A^{T} = A^{T}$, which is not satisfied. The input parameters of GNN$(A^{T}A, I_4, A^{T})$ are $\gamma = 10^{8}$ and $V(0) = O_4$, where $O_4$ denotes the $4 \times 4$ zero matrix. The corresponding GGNN$((A^{T}A)^2, I, A^{T}AA^{T})$ design is based on the error matrix $E_G(t) = A^{T}AA^{T}(I - AV)$. The Simulink implementation of GGNN$(A,B,D)$ from Figure 1 and the Simulink implementation of GNN$(A,B,D)$ from [34] produce, in this case, the graphical results presented in Figure 2 and Figure 3, which display the behaviors of the norms $\|E_G(t)\|_F = \|A^{T}AA^{T}(I - AV(t))\|_F$ and $\|V(t) - V^{*}\|_F$, respectively. The norms generated by the GGNN formula decay to zero faster than the corresponding norms in the GNN model. The graphs in the presented figures support the fast convergence of the GGNN dynamical system and suggest that model (10) is applicable to problems that require the computation of the Moore–Penrose inverse.
Example 2.
Let us consider the matrices
A = 8 8 4 11 4 7 1 4 3 0 12 10 6 12 12 , B = 1 0 0 0 1 0 0 0 1 0 0 0 , D = 84 2524 304 2252 623 2897 484 885 701 1894 2278 2652 2778 1524 3750 .
The exact minimum-norm least-squares solution is
V * = A D B = 7409 65 9564 65 8953 65 0 968 13 1770 13 1402 13 0 6503 65 4187 65 8826 65 0 .
The ranks of the input matrices are equal to $r = \operatorname{rank}(A) = 2$, $\operatorname{rank}(D) = 2$ and $\operatorname{rank}(B) = 3$. Constraint (4) is satisfied in this case. The linear GGNN$(A,B,D)$ formula (10) is applied to solve the matrix equation $AXB = D$. The gain parameter of the model is $\gamma = 10^{9}$, the initial state is $V(0) = 0$, and the stopping time is $t = 0.00001$, which gives
X = 113.9846 147.1385 137.7385 0 74.4615 136.1538 107.8462 0 100.0462 64.4154 135.7846 0 A D B .
The elementwise trajectories of the state variables $v_{ij}$ of the state matrix $V(t)$ are shown in Figure 4a–c with solid red lines for linear, power-sigmoid and smooth power-sigmoid activation functions, respectively. The fast convergence of the elementwise trajectories to the corresponding black dashed trajectories of the theoretical solution $V^{*}$ is notable. In addition, faster convergence caused by the nonlinear AFs $f_{ps}$ and $f_{sps}$ is noticeable in Figure 4b,c. The trajectories in the figures show the usual convergence behavior, which is consistent with the global asymptotic stability of the system. The norms of the error matrix $E_G$ of both the GNN and GGNN models under linear and nonlinear AFs are shown in Figure 5a–c. The power-sigmoid and smooth power-sigmoid activation functions converge faster than the linear activation. On each graph in Figure 5a–c, the Frobenius norm $\|E_G(t)\|_F$ of the error matrix $E_G(t)$ in the GGNN formula decays to zero faster than that in the GNN model. Moreover, in each graph in Figure 6a–c, the Frobenius norm $\|E(t)\|_F$ in the GGNN formula decays to zero faster than that in the GNN model, which supports the claim that the proposed dynamical system (10) converges faster than (3).
All graphs shown in Figure 5 and Figure 6 confirm the applicability of the proposed GGNN design compared to the traditional GNN design, even if constraint (4) holds.
Example 3.
Let us explore the behavior of GNN and GGNN dynamics for computing the Moore–Penrose inverse of the matrix
A = 9 3 3 1 1 0 4 7 2 2 4 4 13 5 8 .
The Moore–Penrose inverse of A is equal to
A = 9908 127779 18037 766674 6874 127779 2663 383337 29941 766674 5690 127779 14426 383337 16741 127779 25130 383337 6392 383337 3517 42593 1979 255558 1073 42593 7373 127779 15049 255558 0.0775 0.0235 0.0538 0.0069 0.0390 0.0445 0.0376 0.1310 0.0655 0.0167 0.0826 0.0077 0.0252 0.0577 0.0589 .
The rank of the input matrix is equal to $r = \operatorname{rank}(A) = 3$. Consequently, the matrix $A$ is left invertible and satisfies $A^{\dagger}A = I$. The error matrix $E(t) = I - VA$ initiates the GNN$(I, A, I)$ dynamics for computing $A^{\dagger}$. The gradient-based error matrix
$$E_G(t) = \left(I - V(t)A\right)A^{T}$$
initiates the GGNN$(I, AA^{T}, A^{T})$ design.
The gain parameter of the model is $\gamma = 100$, and the initial state is $V(0) = 0$ with a stop time $t = 0.00001$.
The Frobenius norms of the error matrix $E(t)$ generated by the linear GNN and GGNN models for different values of $\gamma$ ($\gamma = 10^{2}$, $\gamma = 10^{3}$, $\gamma = 10^{6}$) are shown in Figure 7a–c. The graphs in these figures confirm an increase in the convergence speed, which is caused by the increase in the gain parameter $\gamma$. Because of that, the considered time intervals are $[0, 10^{-2}]$, $[0, 10^{-3}]$ and $[0, 10^{-6}]$, respectively. In all three scenarios, a faster convergence of the GGNN model is observable compared to the GNN design. The values of the norm $\|E_G\|_F$ generated by both the GNN and GGNN models with linear and two nonlinear activation functions are shown in Figure 8a–c. As in the previous example, the GGNN appears to converge faster than the GNN model.
In addition, the graphs in Figure 8b,c, corresponding to the power-sigmoid and smooth power-sigmoid AFs, respectively, show a certain level of instability in convergence, as well as an increase in the value of $\|E_G(t)\|_F$.
Example 4.
Consider the matrices
A = 15 352 45 238 42 5 14 8 132 65 235 65 44 350 73 , D = 4 4 16 3 1 9 1 7 2 2 2 4 4 1 5 , A 1 = D A ,
which do not satisfy $\operatorname{rank}(A_1) = \operatorname{rank}(D) = 3$. Now, we apply the GNN and GGNN formulae to solve the matrix equation $A_1X = D$. The standard error function is defined as $E(t) = D - A_1V(t)$. So, we consider GNN$(A_1, I_3, D)$. The error matrix for the corresponding GGNN model is $E_G(t) = A_1^{T}\left(D - A_1V(t)\right)$, which initiates the GGNN$(A_1^{T}A_1, I_3, A_1^{T}D)$ flow. The gain parameter of the model is $\gamma = 10^{9}$, and the final time is $t = 0.00001$. The zero initial state $V(0) = 0$ generates the best approximate solution $X = A_1^{\dagger}D = (DA)^{\dagger}D$ of the matrix equation $A_1X = D$, given by
X = A 1 D = 133851170015 180355524917879 1648342203725 180355524917879 608888775010 180355524917879 508349079720 180355524917879 691967699675 180355524917879 48398092277 180355524917879 68130232042 180355524917879 242513061343 180355524917879 82710890618 180355524917879 31936168532 180355524917879 727110260384 180355524917879 134047117682 180355524917879 172434574901 180355524917879 1350198643304 180355524917879 225136761416 180355524917879 0.000742 0.00914 0.00338 0.00282 0.00384 0.000268 0.000378 0.00134 0.000459 0.000177 0.00403 0.000743 0.000956 0.00749 0.00125 .
The Frobenius norms of the error matrix $E(t) = D - A_1V(t)$ in the GNN and GGNN models for both linear and nonlinear activation functions are shown in Figure 9a–c, and the Frobenius norms of the error matrix $E_G(t) = A_1^{T}\left(D - A_1V(t)\right)$ in both models for linear and nonlinear activation functions are shown in Figure 10a–c. It is observable that the GGNN converges faster than the GNN.
Example 5.
Table 1 and Table 2 show the results obtained during experiments we conducted with nonsquare matrices, where $m \times n$ is the dimension of the matrix. Table 1 lists the input data that were used to perform experiments with the Simulink model and generated the results in Table 2. The best cases in Table 2 are marked in bold text.
The numerical results arranged in Table 2 are divided into two parts by a horizontal line. The upper part corresponds to the test matrices of dimensions $m, n \le 10$, while the lower part corresponds to the dimensions $m, n \ge 10$. Considering the first two columns, it is observable from the upper part that the GNN generates smaller values of $\|E(t)\|_F$ than the GGNN. The values of $\|E(t)\|_F$ in the lower part generated by the GNN and GGNN are equal. Considering the third and fourth columns, it is observable from the upper part that the GNN generates smaller values of $\|E_G(t)\|_F$ than the GGNN. On the other hand, the values of $\|E_G(t)\|_F$ in the lower part, generated by the GGNN, are smaller than the corresponding values generated by the GNN. The last two columns show that the GGNN requires less CPU time compared to the GNN. The general conclusion is that the GGNN model is more efficient for rank-deficient test matrices of larger order $m, n \ge 10$.

5. Mixed GGNN-GZNN Model for Solving Matrix Equations

The gradient-based error matrix for solving the matrix equation $AX = B$ is defined by
$$E_G^{A,I,B}(t) = A^{T}\left(AV(t) - B\right).$$
The GZNN design (14) corresponding to the error matrix $E_G^{A,I,B}$, designated GZNN$(A,I,B)$, is of the form:
$$\dot E_G^{A,I,B}(t) = -\gamma\,\mathcal{F}\left(A^{T}\left(AV(t) - B\right)\right). \tag{22}$$
Now, the scalar-valued norm-based error function corresponding to $E_G^{A,I,B}(t)$ is given by
$$\varepsilon(t) = \varepsilon\left(V(t)\right) = \frac{1}{2}\,\|E_G^{A,I,B}(t)\|_F^2 = \frac{1}{2}\,\|A^{T}\left(AV(t) - B\right)\|_F^2.$$
The following dynamic state equation can be derived using the GGNN$(A,I,B)$ design formula based on (10):
$$\dot V(t) = -\gamma A^{T}A\,\mathcal{F}\left(A^{T}\left(AV(t) - B\right)\right). \tag{23}$$
Further, using a combination of $\dot E_G^{A,I,B}(t) = A^{T}A\,\dot V(t)$ and the GGNN dynamics (23), it follows that
$$\dot E_G^{A,I,B}(t) = A^{T}A\,\dot V(t) = -\gamma A^{T}A\,A^{T}A\,\mathcal{F}\left(A^{T}\left(AV(t) - B\right)\right). \tag{24}$$
The next step is to define the new hybrid model based on the summation of the right-hand sides in (22) and (24), as follows:
$$\dot E_G^{A,I,B}(t) = -\gamma\left(\left(A^{T}A\right)^2 + I\right)\mathcal{F}\left(A^{T}\left(AV(t) - B\right)\right). \tag{25}$$
The model (25) is derived from the combination of the model GGNN$(A,I,B)$ and the model GZNN$(A,I,B)$. Hence, it is equally justified to use the term Hybrid GGNN (abbreviated HGGNN) or Hybrid GZNN (abbreviated HGZNN) model. However, model (25) is implicit, so it is not a type of GGNN dynamics. On the other hand, it is designed for time-invariant matrices, which is not in accordance with the common nature of GZNN models, because the GZNN is usually used in the time-varying case. A formal comparison of (25) and GZNN$(A,I,B)$ reveals that both methods possess identical left-hand sides, and the right-hand side of (25) can be derived by multiplying the right-hand side of GZNN$(A,I,B)$ by the term $\left(A^{T}A\right)^2 + I$.
Formally, (25) is closer to GZNN dynamics, so we will denote the model (25) by HGZNN$(A,I,B)$, keeping in mind that this model is not the exact GZNN neural dynamics and is applicable to the time-invariant case, i.e., the case of the constant coefficient matrices $A$, $I$ and $B$. Figure 11 represents the Simulink implementation of the HGZNN$(A,I,B)$ dynamics (25).
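The implicit model (25) can be simulated once $\dot V(t)$ is isolated from $A^{T}A\,\dot V(t)$. The sketch below is an illustration only, not the Simulink model of Figure 11: it assumes that $A$ has full column rank, so that $A^{T}A$ is invertible and a linear solve recovers $\dot V(t)$, and it uses explicit Euler steps with hypothetical data; the name `hgznn_flow` is likewise an assumption.

```python
import numpy as np

def hgznn_flow(A, B, V0, gamma=1.0, dt=0.01, steps=2000, f=lambda x: x):
    """Euler steps for A^T A dV/dt = -gamma * ((A^T A)^2 + I) F(A^T (A V - B)).

    Assumes A has full column rank, so A^T A is invertible.
    """
    AtA = A.T @ A
    K = AtA @ AtA + np.eye(AtA.shape[0])      # the factor (A^T A)^2 + I from (25)
    V = np.array(V0, dtype=float)
    for _ in range(steps):
        E_G = A.T @ (A @ V - B)               # gradient-based error matrix
        dE_G = -gamma * (K @ f(E_G))          # prescribed time derivative of E_G
        V = V + dt * np.linalg.solve(AtA, dE_G)   # recover dV/dt from A^T A dV/dt = dE_G
    return V

# Hypothetical consistent problem A X = B with a full-column-rank A.
A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
X_true = np.array([[1.0, -1.0], [2.0, 0.5]])
B = A @ X_true

V = hgznn_flow(A, B, V0=np.zeros((2, 2)))
print("||A V - B||_F at the final step:", np.linalg.norm(A @ V - B))
```

For rank-deficient $A$, the linear solve is no longer available and a different treatment of the implicit left-hand side would be needed; that case is outside the scope of this sketch.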
Now, we consider the process of solving the matrix equation $XC = D$. The error matrix for this equation is defined by
$$E_G^{I,C,D}(t) = \left(V(t)C - D\right)C^{T}.$$
The GZNN design (14) corresponding to the error matrix $E_G^{I,C,D}$, denoted by GZNN$(I,C,D)$, is of the form:
$$\dot E_G^{I,C,D}(t) = \dot V(t)\,CC^{T} = -\gamma\,\mathcal{F}\left(\left(V(t)C - D\right)C^{T}\right). \tag{26}$$
On the other hand, the GGNN design formula (10) produces the following dynamic state equation:
$$\dot V(t) = -\gamma\,\mathcal{F}\left(\left(V(t)C - D\right)C^{T}\right)CC^{T}, \quad V(0) = V_0. \tag{27}$$
The GGNN model (27) is denoted by GGNN$(I,C,D)$. It implies
$$\dot E_G^{I,C,D}(t) = \dot V(t)\,CC^{T} = -\gamma\,\mathcal{F}\left(\left(V(t)C - D\right)C^{T}\right)CC^{T}\,CC^{T}. \tag{28}$$
A new hybrid model based on the summation of the right-hand sides in (26) and (28) can be proposed as follows:
$$\dot E_G^{I,C,D}(t) = -\gamma\,\mathcal{F}\left(\left(V(t)C - D\right)C^{T}\right)\left(I + \left(CC^{T}\right)^2\right). \tag{29}$$
The model (29) will be denoted by HGZNN$(I,C,D)$. This is the case with the constant coefficient matrices $I$, $C$ and $D$.
For the purposes of the proof of the following results, we will use $\mathrm{ECR}(\mathcal{M})$ to denote the exponential convergence rate of the model $\mathcal{M}$. By $\lambda_{\min}(K)$ and $\lambda_{\max}(K)$, we denote the smallest and largest eigenvalues of the matrix $K$, respectively. Continuing the previous work, we use three types of activation functions $\mathcal{F}$: linear, power-sigmoid and smooth power-sigmoid.
The following theorem determines the equilibrium state of HGZNN ( A , I , B ) and defines its global exponential convergence.
Theorem 3.
Let $A \in \mathbb{R}^{k\times n}$, $B \in \mathbb{R}^{k\times m}$ be given and satisfy $AA^{\dagger}B = B$, and let $V(t) \in \mathbb{R}^{n\times m}$ be the state matrix of (25), where $\mathcal{F}$ is defined by $f_{lin}$, $f_{ps}$ or $f_{sps}$.
(a) Then, $V(t)$ achieves global convergence and satisfies $AV(t) \to B$ when $t \to +\infty$, starting from any initial state $V(0) \in \mathbb{R}^{n\times m}$. The state matrix $V(t) \in \mathbb{R}^{n\times m}$ of HGZNN$(A, I, B)$ is stable in the sense of Lyapunov.
(b) The exponential convergence rate of the HGZNN$(A, I, B)$ model (25) in the linear case is equal to
$$\mathrm{ECR}(\mathrm{HGZNN}(A, I, B)) = \gamma\left(1 + \sigma_{\min}^4(A)\right), \tag{30}$$
where $\sigma_{\min}(A) = \sqrt{\lambda_{\min}(A^{\mathrm T}A)}$ is the minimum singular value of $A$.
(c) The activation state variable matrix $V(t)$ of the model HGZNN$(A, I, B)$ is convergent when $t \to +\infty$ with the equilibrium state matrix
$$V(t) \to \tilde{V}_{V(0)} = A^{\dagger}B + (I - A^{\dagger}A)V(0). \tag{31}$$
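Proof.
(a) The assumption $AA^{\dagger}B = B$ provides the solvability of the matrix equation $AX = B$.
The appropriate Lyapunov function is defined as
$$L(t) = \frac{1}{2}\left\|E_G^{A,I,B}(t)\right\|_F^2 = \frac{1}{2}\mathrm{Tr}\left(\left(E_G^{A,I,B}(t)\right)^{\mathrm T}E_G^{A,I,B}(t)\right).$$
Hence, from (25) and $d\,\mathrm{Tr}(V^{\mathrm T}V) = 2\,\mathrm{Tr}(V^{\mathrm T}dV)$, it holds that
$$\begin{aligned}
\dot{L}(t) &= \frac{1}{2}\frac{d}{dt}\mathrm{Tr}\left(\left(E_G^{A,I,B}(t)\right)^{\mathrm T}E_G^{A,I,B}(t)\right) = \mathrm{Tr}\left(\left(E_G^{A,I,B}(t)\right)^{\mathrm T}\dot{E}_G^{A,I,B}(t)\right)\\
&= \mathrm{Tr}\left(\left(E_G^{A,I,B}(t)\right)^{\mathrm T}\left[-\gamma\left(\left(A^{\mathrm T}A\right)^2 + I\right)\mathcal{F}\left(E_G^{A,I,B}(t)\right)\right]\right)\\
&= -\gamma\,\mathrm{Tr}\left(\left(\left(A^{\mathrm T}A\right)^2 + I\right)\mathcal{F}\left(E_G^{A,I,B}(t)\right)\left(E_G^{A,I,B}(t)\right)^{\mathrm T}\right).
\end{aligned}$$
According to similar results from [45], one can verify the following inequality:
$$\dot{L}(t) \le -\gamma\,\mathrm{Tr}\left(\left(\left(A^{\mathrm T}A\right)^2 + I\right)E_G^{A,I,B}(t)\left(E_G^{A,I,B}(t)\right)^{\mathrm T}\right).$$
We also consider the following inequality from [46], which is valid for a real symmetric matrix $K$ and a real symmetric positive-semidefinite matrix $L$ of the same size:
$$\lambda_{\min}(K)\,\mathrm{Tr}(L) \le \mathrm{Tr}(KL) \le \lambda_{\max}(K)\,\mathrm{Tr}(L). \tag{32}$$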
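As a quick sanity check of the trace inequality (32), the bounds can be evaluated on random data. This is only an illustrative sketch: the helper name `trace_bounds`, the matrix sizes and the random seed are assumptions, and $K$, $L$ are built in the same way as in the proof ($K = (A^{\mathrm T}A)^2 + I$, $L = EE^{\mathrm T}$).

```python
import numpy as np

def trace_bounds(K, L):
    """Return (lambda_min(K) Tr(L), Tr(K L), lambda_max(K) Tr(L)) for symmetric K."""
    eig = np.linalg.eigvalsh(K)      # eigenvalues of symmetric K in ascending order
    tr_L = np.trace(L)
    return eig[0] * tr_L, np.trace(K @ L), eig[-1] * tr_L

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))      # hypothetical coefficient matrix
E = rng.standard_normal((3, 2))      # hypothetical error matrix
G = A.T @ A
K = G @ G + np.eye(3)                # symmetric, all eigenvalues >= 1
L = E @ E.T                          # symmetric positive semidefinite
lower, middle, upper = trace_bounds(K, L)
```

Since every eigenvalue of this $K$ is at least 1, the lower bound is never smaller than $\mathrm{Tr}(L)$, which is what drives the decay estimate in the proof.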
Now, the following can be chosen: $K = \left(A^{\mathrm T}A\right)^2 + I$ and $L = E_G^{A,I,B}(t)\left(E_G^{A,I,B}(t)\right)^{\mathrm T}$. Consider $\lambda_{\min}\left(\left(A^{\mathrm T}A\right)^2\right) = \lambda_{\min}^2\left(A^{\mathrm T}A\right) = \sigma_{\min}^4(A)$, where $\lambda_{\min}(A^{\mathrm T}A)$ is the minimum eigenvalue of $A^{\mathrm T}A$, and $\sigma_{\min}(A) = \sqrt{\lambda_{\min}(A^{\mathrm T}A)}$ is the minimum singular value of $A$. Then, $1 + \sigma_{\min}^4(A) \ge 1$ is the minimum eigenvalue of $\left(A^{\mathrm T}A\right)^2 + I$, which implies
$$\dot{L}(t) \le -\gamma\left(1 + \sigma_{\min}^4(A)\right)\mathrm{Tr}\left(E_G^{A,I,B}(t)\left(E_G^{A,I,B}(t)\right)^{\mathrm T}\right). \tag{33}$$
From (33), it can be concluded that
$$\dot{L}(t)\begin{cases} < 0 & \text{if } E_G^{A,I,B}(t) \ne 0,\\ = 0 & \text{if } E_G^{A,I,B}(t) = 0.\end{cases} \tag{34}$$
According to (34), the Lyapunov stability theory confirms that $E^{A,I,B}(t) = AV(t) - B = 0$ is a globally asymptotically stable equilibrium point of the HGZNN$(A, I, B)$ model (25). So, $E^{A,I,B}(t)$ converges to the zero matrix, i.e., $AV(t) \to B$, from any initial state $V(0)$.
(b) From (a), it follows that
$$\dot{L} \le -\gamma\left(1 + \sigma_{\min}^4(A)\right)\mathrm{Tr}\left(\left(E_G^{A,I,B}(t)\right)^{\mathrm T}E_G^{A,I,B}(t)\right) = -\gamma\left(1 + \sigma_{\min}^4(A)\right)\left\|E_G^{A,I,B}(t)\right\|_F^2 = -2\gamma\left(1 + \sigma_{\min}^4(A)\right)L(t).$$
This implies
$$\begin{aligned}
L \le L(0)\,e^{-2\gamma\left(1 + \sigma_{\min}^4(A)\right)t}
&\Longleftrightarrow \left\|E_G^{A,I,B}(t)\right\|_F^2 \le \left\|E_G^{A,I,B}(0)\right\|_F^2 e^{-2\gamma\left(1 + \sigma_{\min}^4(A)\right)t}\\
&\Longleftrightarrow \left\|E_G^{A,I,B}(t)\right\|_F \le \left\|E_G^{A,I,B}(0)\right\|_F e^{-\gamma\left(1 + \sigma_{\min}^4(A)\right)t},
\end{aligned}$$
which confirms the convergence rate (30) of HGZNN$(A, I, B)$.
(c) This part of the proof can be verified with the particular case $B := I$, $D := B$ of Theorem 2.
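The rate in Theorem 3(b) can be illustrated numerically in the linear case, where the error dynamics described for (25) reduce to $\dot{E}_G = -\gamma\left((A^{\mathrm T}A)^2 + I\right)E_G$ and admit a closed-form solution through the eigendecomposition of the symmetric matrix $(A^{\mathrm T}A)^2 + I$. The sketch below is an assumption-laden illustration (hypothetical function names, closed-form solution instead of the Simulink ODE solver), not the authors' implementation.

```python
import numpy as np

def hgznn_linear_error(A, E0, gamma, t):
    """E_G(t) = exp(-gamma * K * t) E_G(0) with K = (A^T A)^2 + I (linear activation)."""
    G = A.T @ A
    K = G @ G + np.eye(G.shape[0])
    w, Q = np.linalg.eigh(K)                     # K = Q diag(w) Q^T
    return (Q * np.exp(-gamma * w * t)) @ Q.T @ E0

def ecr_hgznn(A, gamma):
    """gamma * (1 + sigma_min(A)^4), with sigma_min(A)^2 = lambda_min(A^T A)."""
    lam_min = max(np.linalg.eigvalsh(A.T @ A)[0], 0.0)
    return gamma * (1.0 + lam_min ** 2)
```

For any starting error $E_G(0)$, the Frobenius norm of `hgznn_linear_error(A, E0, gamma, t)` should stay below $\|E_G(0)\|_F\,e^{-\mathrm{ECR}\,t}$, in agreement with the bound derived in part (b).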
Theorem 4.
Let $C \in \mathbb{R}^{m\times l}$, $D \in \mathbb{R}^{n\times l}$ be given and satisfy $DC^{\dagger}C = D$, and let $V(t) \in \mathbb{R}^{n\times m}$ be the state matrix of (29), where $\mathcal{F}$ is defined by $f_{lin}$, $f_{ps}$ or $f_{sps}$.
(a) Then, $V(t)$ achieves global convergence $V(t)C \to D$ when $t \to +\infty$, starting from any initial state $V(0) \in \mathbb{R}^{n\times m}$. The state matrix $V(t) \in \mathbb{R}^{n\times m}$ of HGZNN$(I, C, D)$ is stable in the sense of Lyapunov.
(b) The exponential convergence rate of the HGZNN$(I, C, D)$ model (29) in the linear case is equal to
$$\mathrm{ECR}(\mathrm{HGZNN}(I, C, D)) = \gamma\left(1 + \sigma_{\min}^4(C)\right). \tag{35}$$
(c) The activation state variable matrix $V(t)$ of the model HGZNN$(I, C, D)$ is convergent when $t \to +\infty$ with the equilibrium state matrix
$$V(t) \to \tilde{V}_{V(0)} = DC^{\dagger} + V(0)(I - CC^{\dagger}). \tag{36}$$
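Proof.
(a) The assumption $DC^{\dagger}C = D$ ensures the solvability of the matrix equation $XC = D$.
Let us define the Lyapunov function by
$$L(t) = \frac{1}{2}\left\|E_G^{I,C,D}(t)\right\|_F^2 = \frac{1}{2}\mathrm{Tr}\left(\left(E_G^{I,C,D}(t)\right)^{\mathrm T}E_G^{I,C,D}(t)\right).$$
Hence, from (29) and $d\,\mathrm{Tr}(X^{\mathrm T}X) = 2\,\mathrm{Tr}(X^{\mathrm T}dX)$, it holds that
$$\begin{aligned}
\dot{L}(t) &= \frac{1}{2}\frac{d}{dt}\mathrm{Tr}\left(\left(E_G^{I,C,D}(t)\right)^{\mathrm T}E_G^{I,C,D}(t)\right) = \mathrm{Tr}\left(\left(E_G^{I,C,D}(t)\right)^{\mathrm T}\dot{E}_G^{I,C,D}(t)\right)\\
&= \mathrm{Tr}\left(\left(E_G^{I,C,D}(t)\right)^{\mathrm T}\left[-\gamma\,\mathcal{F}\left(E_G^{I,C,D}(t)\right)\left(\left(CC^{\mathrm T}\right)^2 + I\right)\right]\right)\\
&= -\gamma\,\mathrm{Tr}\left(\left(\left(CC^{\mathrm T}\right)^2 + I\right)\left(E_G^{I,C,D}(t)\right)^{\mathrm T}\mathcal{F}\left(E_G^{I,C,D}(t)\right)\right).
\end{aligned}$$
Following the principles from [45], one can verify the following inequality:
$$\dot{L}(t) \le -\gamma\,\mathrm{Tr}\left(\left(\left(CC^{\mathrm T}\right)^2 + I\right)\left(E_G^{I,C,D}(t)\right)^{\mathrm T}E_G^{I,C,D}(t)\right).$$
Consider the inequality (32) with the particular settings $K = \left(CC^{\mathrm T}\right)^2 + I$, $L = \left(E_G^{I,C,D}(t)\right)^{\mathrm T}E_G^{I,C,D}(t)$, where the factors are ordered in accordance with the right multiplication in (29). Let $\lambda_{\min}\left(\left(CC^{\mathrm T}\right)^2\right) = \sigma_{\min}^4(C)$ be the minimum eigenvalue of $\left(CC^{\mathrm T}\right)^2$. Then, $1 + \sigma_{\min}^4(C) \ge 1$ is the minimum eigenvalue of $\left(CC^{\mathrm T}\right)^2 + I$, which implies
$$\dot{L}(t) \le -\gamma\left(1 + \sigma_{\min}^4(C)\right)\mathrm{Tr}\left(\left(E_G^{I,C,D}(t)\right)^{\mathrm T}E_G^{I,C,D}(t)\right). \tag{37}$$
From (37), it can be concluded that
$$\dot{L}(t)\begin{cases} < 0 & \text{if } E_G^{I,C,D}(t) \ne 0,\\ = 0 & \text{if } E_G^{I,C,D}(t) = 0.\end{cases} \tag{38}$$
According to (38), the Lyapunov stability theory confirms that $E^{I,C,D}(t) = V(t)C - D = 0$ is a globally asymptotically stable equilibrium point of the HGZNN$(I, C, D)$ model (29). So, $E^{I,C,D}(t)$ converges to the zero matrix, i.e., $V(t)C \to D$, from any initial state $V(0)$.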
(b) From (a), it follows that
$$\dot{L} \le -\gamma\left(1 + \sigma_{\min}^4(C)\right)\mathrm{Tr}\left(\left(E_G^{I,C,D}(t)\right)^{\mathrm T}E_G^{I,C,D}(t)\right) = -\gamma\left(1 + \sigma_{\min}^4(C)\right)\left\|E_G^{I,C,D}(t)\right\|_F^2 = -2\gamma\left(1 + \sigma_{\min}^4(C)\right)L(t).$$
This implies
$$\begin{aligned}
L \le L(0)\,e^{-2\gamma\left(1 + \sigma_{\min}^4(C)\right)t}
&\Longleftrightarrow \left\|E_G^{I,C,D}(t)\right\|_F^2 \le \left\|E_G^{I,C,D}(0)\right\|_F^2 e^{-2\gamma\left(1 + \sigma_{\min}^4(C)\right)t}\\
&\Longleftrightarrow \left\|E_G^{I,C,D}(t)\right\|_F \le \left\|E_G^{I,C,D}(0)\right\|_F e^{-\gamma\left(1 + \sigma_{\min}^4(C)\right)t},
\end{aligned}$$
which confirms the convergence rate (35) of HGZNN$(I, C, D)$.
(c) This part of the proof can be verified with the particular case $A := I$, $B := C$ of Theorem 2.
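The equilibrium states in Theorems 3(c) and 4(c) can be checked directly with a pseudoinverse. The sketch below assumes NumPy's `np.linalg.pinv` as the Moore-Penrose inverse and uses hypothetical helper names; the explicit `rcond` cutoff is an assumption chosen so that rank-deficient test matrices are handled robustly.

```python
import numpy as np

def equilibrium_left(A, B, V0, rcond=1e-10):
    """A^+ B + (I - A^+ A) V0: limit state for A X = B as in Theorem 3(c)."""
    Ap = np.linalg.pinv(A, rcond=rcond)
    return Ap @ B + (np.eye(A.shape[1]) - Ap @ A) @ V0

def equilibrium_right(C, D, V0, rcond=1e-10):
    """D C^+ + V0 (I - C C^+): limit state for X C = D as in Theorem 4(c)."""
    Cp = np.linalg.pinv(C, rcond=rcond)
    return D @ Cp + V0 @ (np.eye(C.shape[0]) - C @ Cp)
```

When the consistency conditions $AA^{\dagger}B = B$ and $DC^{\dagger}C = D$ hold, both limits solve their respective equations for every choice of $V(0)$, and $V(0) = 0$ yields $A^{\dagger}B$ and $DC^{\dagger}$.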
Corollary 1.
(a) Let the matrices $A \in \mathbb{R}^{k\times n}$, $B \in \mathbb{R}^{k\times m}$ be given and satisfy $AA^{\dagger}B = B$, and let $V(t) \in \mathbb{R}^{n\times m}$ be the state matrix of (25), with an arbitrary nonlinear activation $\mathcal{F}$. Then, $\mathrm{ECR}(\mathrm{GZNN}(A, I, B)) = \gamma$ and $\mathrm{ECR}(\mathrm{GGNN}(A, I, B)) = \gamma\,\sigma_{\min}(A)$.
(b) Let the matrices $C \in \mathbb{R}^{m\times l}$, $D \in \mathbb{R}^{n\times l}$ be given and satisfy $DC^{\dagger}C = D$, and let $V(t) \in \mathbb{R}^{n\times m}$ be the state matrix of (29) with an arbitrary nonlinear activation $\mathcal{F}$. Then, $\mathrm{ECR}(\mathrm{GZNN}(I, C, D)) = \gamma$ and $\mathrm{ECR}(\mathrm{GGNN}(I, C, D)) = \gamma\,\sigma_{\min}(C)$.
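From Theorem 3 and Corollary 1(a), it follows that
$$\frac{\mathrm{ECR}(\mathrm{HGZNN}(A, I, B))}{\mathrm{ECR}(\mathrm{GZNN}(A, I, B))} = 1 + \sigma_{\min}^4(A) \ge 1, \tag{39}$$
$$\frac{\mathrm{ECR}(\mathrm{HGZNN}(A, I, B))}{\mathrm{ECR}(\mathrm{GGNN}(A, I, B))} = \frac{1 + \sigma_{\min}^4(A)}{\sigma_{\min}^2(A)} > 1, \tag{40}$$
$$\frac{\mathrm{ECR}(\mathrm{GZNN}(A, I, B))}{\mathrm{ECR}(\mathrm{GGNN}(A, I, B))} = \frac{1}{\sigma_{\min}^2(A)}\begin{cases} < 1, & \sigma_{\min}(A) > 1,\\ \ge 1, & \sigma_{\min}(A) \le 1.\end{cases} \tag{41}$$
Similarly, according to Theorem 4 and Corollary 1(b), it can be concluded that
$$\frac{\mathrm{ECR}(\mathrm{HGZNN}(I, C, D))}{\mathrm{ECR}(\mathrm{GZNN}(I, C, D))} = 1 + \sigma_{\min}^4(C) \ge 1, \tag{42}$$
$$\frac{\mathrm{ECR}(\mathrm{HGZNN}(I, C, D))}{\mathrm{ECR}(\mathrm{GGNN}(I, C, D))} = \frac{1 + \sigma_{\min}^4(C)}{\sigma_{\min}^2(C)} > 1, \tag{43}$$
$$\frac{\mathrm{ECR}(\mathrm{GZNN}(I, C, D))}{\mathrm{ECR}(\mathrm{GGNN}(I, C, D))} = \frac{1}{\sigma_{\min}^2(C)}\begin{cases} < 1, & \sigma_{\min}(C) > 1,\\ \ge 1, & \sigma_{\min}(C) \le 1.\end{cases} \tag{44}$$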
Remark 1. (a) According to (40), it follows that $\mathrm{ECR}(\mathrm{HGZNN}(A, I, B)) > \mathrm{ECR}(\mathrm{GGNN}(A, I, B))$. According to (39), it is obtained that
$$\mathrm{ECR}(\mathrm{HGZNN}(A, I, B))\begin{cases} = \mathrm{ECR}(\mathrm{GZNN}(A, I, B)), & \sigma_{\min}(A) = 0,\\ > \mathrm{ECR}(\mathrm{GZNN}(A, I, B)), & \sigma_{\min}(A) > 0.\end{cases}$$
According to (41), it follows that
$$\mathrm{ECR}(\mathrm{GZNN}(A, I, B))\begin{cases} < \mathrm{ECR}(\mathrm{GGNN}(A, I, B)), & \sigma_{\min}(A) > 1,\\ \ge \mathrm{ECR}(\mathrm{GGNN}(A, I, B)), & \sigma_{\min}(A) \le 1.\end{cases}$$
As a result, the following conclusions follow:
- HGZNN$(A, I, B)$ is always faster than GGNN$(A, I, B)$;
- HGZNN$(A, I, B)$ is faster than GZNN$(A, I, B)$ in the case where $\sigma_{\min}(A) > 0$;
- GZNN$(A, I, B)$ is faster than GGNN$(A, I, B)$ in the case where $\sigma_{\min}(A) < 1$.
(b) According to (43), it follows that $\mathrm{ECR}(\mathrm{HGZNN}(I, C, D)) > \mathrm{ECR}(\mathrm{GGNN}(I, C, D))$. According to (42), it follows that
$$\mathrm{ECR}(\mathrm{HGZNN}(I, C, D))\begin{cases} = \mathrm{ECR}(\mathrm{GZNN}(I, C, D)), & \sigma_{\min}(C) = 0,\\ > \mathrm{ECR}(\mathrm{GZNN}(I, C, D)), & \sigma_{\min}(C) > 0.\end{cases}$$
According to (41) and (44), it can be verified that
$$\mathrm{ECR}(\mathrm{GZNN}(I, C, D))\begin{cases} < \mathrm{ECR}(\mathrm{GGNN}(I, C, D)), & \sigma_{\min}(C) > 1,\\ \ge \mathrm{ECR}(\mathrm{GGNN}(I, C, D)), & \sigma_{\min}(C) \le 1.\end{cases}$$
As a result, the following conclusions follow:
- HGZNN$(I, C, D)$ is always faster than GGNN$(I, C, D)$;
- HGZNN$(I, C, D)$ is faster than GZNN$(I, C, D)$ in the case where $\sigma_{\min}(C) > 0$;
- GZNN$(I, C, D)$ is faster than GGNN$(I, C, D)$ in the case where $\sigma_{\min}(C) < 1$.
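The comparisons in Remark 1 depend only on the minimum singular value, so the ratios (39)-(41) (and, with $C$ in place of $A$, (42)-(44)) can be tabulated for sample values. The function below is a small illustrative helper with a hypothetical name; it assumes $\sigma_{\min} > 0$ so that the ratios involving GGNN are defined.

```python
def ecr_ratios(sigma_min):
    """Ratios (39)-(41) of exponential convergence rates for a given sigma_min > 0."""
    s2 = sigma_min ** 2
    return {
        "HGZNN/GZNN": 1.0 + s2 ** 2,          # (39)
        "HGZNN/GGNN": (1.0 + s2 ** 2) / s2,   # (40)
        "GZNN/GGNN": 1.0 / s2,                # (41)
    }

for s in (0.5, 1.0, 2.0):
    r = ecr_ratios(s)
    print(s, {name: round(value, 4) for name, value in r.items()})
```

For $\sigma_{\min} = 0.5$ the GZNN/GGNN ratio is 4, for $\sigma_{\min} = 2$ it drops to 0.25, while the two HGZNN ratios stay at or above 1 in all three cases.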
Remark 2.
The particular HGZNN$(A^{\mathrm T}A, I, A^{\mathrm T})$ and GGNN$(A^{\mathrm T}A, I, A^{\mathrm T})$ designs define the corresponding modifications of the improved GNN design proposed in [26] if $A^{\mathrm T}A$ is invertible. In the dual case, HGZNN$(I, CC^{\mathrm T}, C^{\mathrm T})$ and GGNN$(I, CC^{\mathrm T}, C^{\mathrm T})$ define the corresponding modifications of the improved GNN design proposed in [26] if $CC^{\mathrm T}$ is invertible.

Regularized HGZNN Model for Solving Matrix Equations

The convergence of HGZNN$(A, I, B)$ (resp. HGZNN$(I, C, D)$), as well as GGNN$(A, I, B)$ (resp. GGNN$(I, C, D)$), can be improved in the case where $\sigma_{\min}(A) > 0$ (resp. $\sigma_{\min}(C) > 0$). There are two situations in which the acceleration terms $A^{\mathrm T}A$ and $CC^{\mathrm T}$ improve the convergence. The first assumes the invertibility of $A$ (resp. $C$), and the second assumes the left invertibility of $A$ (resp. right invertibility of $C$). Still, in some situations, the matrices $A$ and $C$ could be rank-deficient. Hence, in the case where $A$ and $C$ are square and singular, it is useful to use the invertible matrices $A_1 := A + \lambda I$ and $C_1 := C + \lambda I$, $\lambda > 0$, instead of $A$ and $C$ and to consider the models HGZNN$(A_1, I, B)$ and HGZNN$(I, C_1, D)$. The following results present the convergence properties under the nonsingularity of $A_1$ and $C_1$.
Corollary 2.
Let $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$ be given and $V(t) \in \mathbb{R}^{n\times m}$ be the state matrix of (25), where $\mathcal{F}$ is defined by $f_{lin}$, $f_{ps}$ or $f_{sps}$. Let $\lambda > 0$ be a selected real number. Then, the following statements are valid:
(a) The state matrix $V(t) \in \mathbb{R}_r^{n\times m}$ of the model HGZNN$(A_1, I, B)$ converges globally to
$$\tilde{V}_{V(0)} = A_1^{-1}B,$$
when $t \to +\infty$, starting from any initial state $V(0) \in \mathbb{R}^{n\times m}$, and the solution is stable in the sense of Lyapunov.
(b) The exponential convergence rate of HGZNN$(A_1, I, B)$ in the case where $\mathcal{F} = I$ is equal to
$$\mathrm{ECR}\left(\mathrm{HGZNN}(A_1, I, B)\right) = \gamma\left(1 + \sigma_{\min}^4(A + \lambda I)\right).$$
(c) Let $\tilde{V}_{V(0)}$ be the limiting value of $V(t)$ when $t \to +\infty$. Then,
$$\lim_{\lambda \to 0}\tilde{V}_{V(0)} = \lim_{\lambda \to 0}\left(A + \lambda I\right)^{-1}B.$$
Proof.
Since $A + \lambda I$ is invertible, it follows that $V = \left(A + \lambda I\right)^{-1}B$.
From (31) and the invertibility of $A + \lambda I$, we conclude the validity of (a). In this case, it follows that
$$\tilde{V}_{V(0)} = (A + \lambda I)^{-1}B + \left(I - (A + \lambda I)^{-1}(A + \lambda I)\right)V(0) = (A + \lambda I)^{-1}B + (I - I)V(0) = (A + \lambda I)^{-1}B.$$
Part (b) is proved analogously to the proof of Theorem 3. The last part (c) follows from (a). □
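The effect of the regularization $A_1 = A + \lambda I$ can be illustrated on a small singular matrix. The sketch below is an assumption-based example: the $2\times 2$ matrix, the right-hand side and the helper names are hypothetical, and the limit state of Corollary 2(a) is computed with a linear solve rather than by simulating the dynamics.

```python
import numpy as np

def regularized_limit(A, B, lam):
    """Limit state (A + lam*I)^{-1} B of the regularized model, as in Corollary 2(a)."""
    A1 = A + lam * np.eye(A.shape[0])
    return np.linalg.solve(A1, B)

def ecr_regularized(A, gamma, lam):
    """gamma * (1 + sigma_min(A + lam*I)^4), the rate in Corollary 2(b)."""
    A1 = A + lam * np.eye(A.shape[0])
    s_min = np.linalg.svd(A1, compute_uv=False).min()
    return gamma * (1.0 + s_min ** 4)

A = np.array([[1.0, 1.0], [1.0, 1.0]])   # singular: eigenvalues 0 and 2
B = np.array([[2.0], [2.0]])             # lies in the range of A
for lam in (1e-1, 1e-3, 1e-6):
    print(lam, regularized_limit(A, B, lam).ravel(), ecr_regularized(A, 10.0, lam))
```

In this consistent example the limit approaches $A^{\dagger}B = (1, 1)^{\mathrm T}$ as $\lambda \to 0$, while the rate approaches $\gamma$ because $\sigma_{\min}(A + \lambda I) = \lambda$ here. For a right-hand side outside the range of $A$, the limit in part (c) need not stay bounded.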
Corollary 3.
Let $C \in \mathbb{R}^{m\times m}$, $D \in \mathbb{R}^{n\times m}$ be given and $V(t) \in \mathbb{R}^{n\times m}$ be the state matrix of (29), where $\mathcal{F} = I$, $\mathcal{F} = \mathcal{F}_{ps}$ or $\mathcal{F} = \mathcal{F}_{sps}$. Let $\lambda > 0$ be a selected real number. Then, the following statements are valid:
(a) The state matrix $V(t) \in \mathbb{R}_r^{n\times m}$ of HGZNN$(I, C_1, D)$ converges globally to
$$\tilde{V}_{V(0)} = D(C + \lambda I)^{-1},$$
when $t \to +\infty$, starting from any initial state $V(0) \in \mathbb{R}^{n\times m}$, and the solution is stable in the sense of Lyapunov.
(b) The exponential convergence rate of HGZNN$(I, C_1, D)$ in the case where $\mathcal{F} = I$ is equal to
$$\mathrm{ECR}\left(\mathrm{HGZNN}(I, C_1, D)\right) = \gamma\left(1 + \sigma_{\min}^4(C_1)\right).$$
(c) Let $\tilde{V}_{V(0)}$ be the limiting value of $V(t)$ when $t \to +\infty$. Then,
$$\lim_{\lambda \to 0}\tilde{V}_{V(0)} = \lim_{\lambda \to 0}D\left(C + \lambda I\right)^{-1}.$$
Proof.
It can be proved analogously to Corollary 2. □
Remark 3.
(a) According to (40), it can be concluded that
$$\mathrm{ECR}(\mathrm{HGZNN}(A_1, I, B)) > \mathrm{ECR}(\mathrm{GGNN}(A_1, I, B)).$$
Based on (39), it can be concluded that
$$\mathrm{ECR}(\mathrm{HGZNN}(A_1, I, B)) > \mathrm{ECR}(\mathrm{GZNN}(A_1, I, B)).$$
According to (41), one concludes that
$$\mathrm{ECR}(\mathrm{GZNN}(A_1, I, B)) < \mathrm{ECR}(\mathrm{GGNN}(A_1, I, B)).$$
(b) According to (43), it can be concluded that
$$\mathrm{ECR}(\mathrm{HGZNN}(I, C_1, D)) > \mathrm{ECR}(\mathrm{GGNN}(I, C_1, D)).$$
According to (42), it follows that
$$\mathrm{ECR}(\mathrm{HGZNN}(I, C_1, D)) > \mathrm{ECR}(\mathrm{GZNN}(I, C_1, D)).$$
Based on (41) and (44), it can be concluded that
$$\mathrm{ECR}(\mathrm{GZNN}(I, C_1, D)) < \mathrm{ECR}(\mathrm{GGNN}(I, C_1, D)).$$

6. Numerical Examples on Hybrid Models

In this section, numerical examples are presented based on the Simulink implementation of the HGZNN formula. The previously mentioned three types of activation functions $f(\cdot)$ in (11), (12) and (13) are used in the following examples. The parameter $\gamma$, the initial state $V(0)$ and the parameters $\rho$ and $\varrho$ of the nonlinear activation functions (12) and (13) are entered directly into the model, while the matrices $A$, $B$, $C$ and $D$ are defined from the workspace. We assume that $\rho = \varrho = 3$ in all examples. The ordinary differential equation solver in the configuration parameters is ode15s.
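For readers who want to experiment outside Simulink, the three activation functions can be sketched as elementwise NumPy functions. The formulas (11)-(13) are not reproduced in this section, so the definitions below are an assumption: they follow the forms of the linear, power-sigmoid and smooth power-sigmoid functions commonly used in the GNN/ZNN literature, with $\rho = \varrho = 3$ as stated above, and should be checked against (11)-(13) before use.

```python
import numpy as np

def _bipolar_sigmoid(x, xi):
    """(1+e^{-xi})/(1-e^{-xi}) * (1-e^{-xi x})/(1+e^{-xi x}), written stably via tanh."""
    return np.tanh(xi * x / 2.0) / np.tanh(xi / 2.0)

def f_lin(x):
    """Linear activation: f(x) = x."""
    return np.asarray(x, dtype=float)

def f_ps(x, rho=3, xi=3.0):
    """Assumed power-sigmoid form: x^rho for |x| >= 1, scaled bipolar sigmoid otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) >= 1.0, x ** rho, _bipolar_sigmoid(x, xi))

def f_sps(x, rho=3, xi=3.0):
    """Assumed smooth power-sigmoid form: 0.5 * x^rho + scaled bipolar sigmoid."""
    x = np.asarray(x, dtype=float)
    return 0.5 * x ** rho + _bipolar_sigmoid(x, xi)
```

With an odd $\rho$, all three functions are odd and monotonically increasing, which is the property the convergence proofs rely on.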
We present numerical examples in which we compare the Frobenius norms $\|E_G\|_F$ and $\|A^{-1}B - V(t)\|_F$ generated by HGZNN, GZNN and GGNN.
Example 6.
Consider the matrix
$$A = \begin{bmatrix}
0.49 & 0.276 & 0.498 & 0.751 & 0.959\\
0.446 & 0.68 & 0.96 & 0.255 & 0.547\\
0.646 & 0.655 & 0.34 & 0.506 & 0.139\\
0.71 & 0.163 & 0.585 & 0.699 & 0.149\\
0.755 & 0.119 & 0.224 & 0.891 & 0.258
\end{bmatrix}.$$
In this example, we compare the HGZNN$(A, I, I)$ model with GZNN$(A, I, I)$ and GGNN$(A, I, I)$, considering all three types of activation functions. The gain parameter of the model is $\gamma = 10^6$, the initial state is $V(0) = 0$, and the final time is $t = 0.00001$.
The Frobenius norms of the error matrix $E_G$ in the HGZNN, GZNN and GGNN models for both linear and nonlinear activation functions are shown in Figure 12a–c, and the error matrices $A^{-1}B - V(t)$ of the models for linear and nonlinear activation functions are shown in Figure 13a–c. On each graph, the Frobenius norm of the error from the HGZNN formula vanishes to zero faster than those from the GZNN and GGNN models.
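The qualitative behavior reported for Example 6 can be reproduced in the linear case without Simulink. The sketch below is not the authors' experiment: it assumes the linear activation, takes the entries of $A$ as they appear in the text, and uses the closed-form solutions of the linear error dynamics $\dot{E}_G = -\gamma E_G$ (GZNN), $\dot{E}_G = -\gamma (A^{\mathrm T}A)^2 E_G$ (GGNN) and $\dot{E}_G = -\gamma\left(I + (A^{\mathrm T}A)^2\right)E_G$ (HGZNN) instead of the ode15s solver. The function name `error_norms` is hypothetical.

```python
import numpy as np

A = np.array([
    [0.49, 0.276, 0.498, 0.751, 0.959],
    [0.446, 0.68, 0.96, 0.255, 0.547],
    [0.646, 0.655, 0.34, 0.506, 0.139],
    [0.71, 0.163, 0.585, 0.699, 0.149],
    [0.755, 0.119, 0.224, 0.891, 0.258],
])

def error_norms(A, gamma, times):
    """Frobenius norms of E_G(t) for linear GZNN, GGNN and HGZNN with B = I, V(0) = 0."""
    G = A.T @ A
    w, Q = np.linalg.eigh(G @ G)       # (A^T A)^2 = Q diag(w) Q^T
    E0 = -A.T                          # E_G(0) = A^T (A V(0) - I) with V(0) = 0
    P = Q.T @ E0                       # coordinates of E_G(0) in the eigenbasis
    out = {"GZNN": [], "GGNN": [], "HGZNN": []}
    for t in times:
        out["GZNN"].append(np.exp(-gamma * t) * np.linalg.norm(E0))
        out["GGNN"].append(np.linalg.norm(np.exp(-gamma * w * t)[:, None] * P))
        out["HGZNN"].append(np.linalg.norm(np.exp(-gamma * (1.0 + w) * t)[:, None] * P))
    return {name: np.array(vals) for name, vals in out.items()}

norms = error_norms(A, gamma=1e6, times=np.linspace(0.0, 1e-5, 11))
```

At every sampled time the HGZNN norm is no larger than the GZNN and GGNN norms, which matches the ordering of the curves described for Figure 12 in the linear case.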
Example 7.
Consider the matrices
$$A = \begin{bmatrix}
0.0818 & 0.0973 & 0.0083 & 0.0060 & 0.0292 & 0.0372\\
0.0818 & 0.0649 & 0.0133 & 0.0399 & 0.0432 & 0.0198\\
0.0722 & 0.0800 & 0.0173 & 0.0527 & 0.0015 & 0.0490\\
0.0150 & 0.0454 & 0.0391 & 0.0417 & 0.0984 & 0.0339\\
0.0660 & 0.0432 & 0.0831 & 0.0657 & 0.0167 & 0.0952\\
0.0519 & 0.0825 & 0.0803 & 0.0628 & 0.0106 & 0.0920
\end{bmatrix},$$
$$B = \begin{bmatrix}
0.1649 & 0.1813 & 0.0851 & 0.1197 & 0.0138 & 0.1437 & 0.1558\\
0.1965 & 0.1759 & 0.0625 & 0.0942 & 0.0639 & 0.1937 & 0.0847\\
0.1460 & 0.1636 & 0.0323 & 0.1392 & 0.1062 & 0.1063 & 0.0182\\
0.0688 & 0.0521 & 0.0358 & 0.1400 & 0.1309 & 0.0650 & 0.0533\\
0.1168 & 0.1189 & 0.0846 & 0.1277 & 0.0815 & 0.0211 & 0.0307\\
0.0216 & 0.0045 & 0.0188 & 0.0067 & 0.1640 & 0.1222 & 0.0562
\end{bmatrix}.$$
In this example, we compare the HGZNN$(A, I, B)$ model with GZNN$(A, I, B)$ and GGNN$(A, I, B)$, considering all three types of activation functions. The gain parameter of the model is $\gamma = 1000$, the initial state is $V(0) = 0$, and the final time is $t = 0.01$.
The elementwise trajectories of the state variable are shown with red lines in Figure 14a–c, for linear, power-sigmoid and smooth power-sigmoid activation functions, respectively. The solid red lines corresponding to HGZNN$(A, I, B)$ converge to the black dashed lines of the theoretical solution $X$. The trajectories exhibit the usual convergence behavior, so the system is globally asymptotically stable. The error matrices $E_G$ of the HGZNN, GZNN and GGNN models for both linear and nonlinear activation functions are shown in Figure 15a–c, and the residual matrices $A^{-1}B - X(t)$ of the models for linear and nonlinear activation functions are shown in Figure 16a–c. In each graph, for both error cases, the Frobenius norm of the error of the HGZNN formula is similar to the Frobenius norm of the error of the GZNN model, and they both converge to zero faster than the GGNN model.
Remark 4.
In this remark, we address the question, "how are the system parameters selected to obtain better performance?" The answer is complex and consists of several parts.
1. The gain parameter $\gamma$ is the parameter with the most influence on the behavior of the observed dynamic systems. The general rule is "the parameter $\gamma$ should be selected as large as possible". The numerical confirmation of this fact is investigated in Figure 7.
2. The influence of $\gamma$ and the activation functions (AFs) is indisputable. The larger the value of $\gamma$, the faster the convergence. In addition, nonlinear AFs accelerate convergence compared to the linear models. In the presented numerical examples, we investigate the influence of three AFs: linear, power-sigmoid and smooth power-sigmoid.
3. The right question is as follows: what makes the GGNN better than the GNN under fair conditions that assume an identical environment during testing? Numerical experiments show better performance of the GGNN design compared to the GNN with respect to all three tested criteria: $\|E(t)\|_F$, $\|E_G(t)\|_F$ and $\|V(t) - V^*\|_F$. Moreover, Table 2 in Example 5 is aimed at convergence analysis. The general conclusion from the numerical data arranged in Table 2 is that the GGNN model is more efficient than the GNN on rank-deficient test matrices of larger order $m, n \ge 10$.
4. The convergence rate of the linear hybrid model HGZNN$(A, I, B)$ depends on $\gamma$ and the singular value $\sigma_{\min}(A)$, while the convergence rate of the hybrid model HGZNN$(I, C, D)$ depends on $\gamma$ and $\sigma_{\min}(C)$.
5. The convergence of the linear regularized hybrid model HGZNN$(A + \lambda I, I, B)$ depends on $\gamma$, $\sigma_{\min}(A)$ and the regularization parameter $\lambda > 0$, while the convergence of the linear regularized hybrid model HGZNN$(I, C + \lambda I, D)$ depends on $\gamma$, $\sigma_{\min}(C)$ and $\lambda$.
In conclusion, it is reasonable to analyze the selection of system parameters in order to obtain better performance. However, the best performance is not defined.
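The dependence on $\gamma$ and $\sigma_{\min}$ described in items 1 and 4 can be made concrete with the bound $\|E_G(t)\|_F \le \|E_G(0)\|_F\,e^{-\mathrm{ECR}\,t}$ for the linear HGZNN model. The helper below is a hypothetical back-of-the-envelope estimate based only on that bound (the name `time_to_tolerance` and the tolerance `eps` are assumptions); it gives a time after which the error norm is guaranteed to have dropped by the factor `eps`, not the exact settling time of a simulation.

```python
import math

def time_to_tolerance(gamma, sigma_min, eps):
    """Sufficient time for the error bound to shrink by the factor eps (0 < eps < 1),
    using the linear HGZNN rate gamma * (1 + sigma_min^4)."""
    rate = gamma * (1.0 + sigma_min ** 4)
    return math.log(1.0 / eps) / rate

for gamma in (1e1, 1e3, 1e6):
    print(gamma, time_to_tolerance(gamma, sigma_min=0.5, eps=1e-3))
```

Doubling $\gamma$ halves the estimated time, which is consistent with the rule that $\gamma$ should be selected as large as the implementation allows.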

7. Conclusions

We show that the error functions that form the basis of GNN and ZNN dynamical evolutions can be defined using the gradient of the Frobenius norm of the traditional error function $E(t)$. The result of such a strategy is the usage of the error function $E_G(t)$ as the basis of GNN dynamics, which results in the proposed GGNN model. The results related to the GNN model (called GNN$(A, B, D)$) for solving the general matrix equation $AXB = D$ are extended to the GGNN model (called GGNN$(A, B, D)$) in both theoretical and computational directions. In a theoretical sense, the convergence of the defined GGNN model is considered. It is shown that the neural state matrix $V(t)$ of the GGNN$(A, B, D)$ model asymptotically converges to the solution of the matrix equation $AXB = D$ for an arbitrary initial state matrix $V(0)$ and coincides with the general solution of the linear matrix equation. A number of applications of GNN$(A, B, D)$ are considered. All applications are globally convergent. Several particular appearances of the general matrix equation are observed and applied for computing various classes of generalized inverses. Illustrative numerical examples and simulation results, obtained using the Matlab Simulink implementation, are presented to demonstrate the validity of the derived theoretical results. The influence of various nonlinear activations on the GNN models is considered in both the theoretical and computational directions. From the presented examples, it can be concluded that the GGNN model is faster and has a smaller error compared to the GNN model.
Further research can be oriented to the definition of finite-time convergent GGNN or GZNN models, as well as the definition of a noise-tolerant GGNN or GZNN design.

Author Contributions

Conceptualization, P.S.S. and G.V.M.; methodology, P.S.S., N.T., D.G. and V.S.; software, D.G., V.L.K. and N.T.; validation, G.V.M., M.J.P. and P.S.S.; formal analysis, M.J.P., N.T. and D.G.; investigation, M.J.P., G.V.M. and P.S.S.; resources, D.G., N.T., V.L.K. and V.S.; data curation, M.J.P., V.L.K., V.S., D.G. and N.T.; writing—original draft preparation, P.S.S., D.G. and N.T.; writing—review and editing, M.J.P. and G.V.M.; visualization, D.G. and N.T.; supervision, G.V.M.; project administration, M.J.P.; funding acquisition, G.V.M., M.J.P. and P.S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Ministry of Science and Higher Education of the Russian Federation (Grant No. 075-15-2022-1121).

Data Availability Statement

Data are available from the authors upon request.

Acknowledgments

Predrag Stanimirović is supported by the Science Fund of the Republic of Serbia (No. 7750185, Quantitative Automata Models: Fundamental Problems and Applications - QUAM). Dimitrios Gerontitis receives financial support from the "Savas Parastatidis" named scholarship granted by the Bodossaki Foundation. Milena J. Petrović acknowledges support from a project supported by the Ministry of Education and Science of the Republic of Serbia, Grant No. 174025.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of the data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Zhang, Y.; Chen, K. Comparison on Zhang neural network and gradient neural network for time-varying linear matrix equation AXB = C solving. In Proceedings of the 2008 IEEE International Conference on Industrial Technology, Chengdu, China, 21–24 April 2008; pp. 1–6.
  2. Zhang, Y.; Yi, C.; Guo, D.; Zheng, J. Comparison on Zhang neural dynamics and gradient-based neural dynamics for online solution of nonlinear time-varying equation. Neural Comput. Appl. 2011, 20, 1–7.
  3. Zhang, Y.; Xu, P.; Tan, L. Further studies on Zhang neural-dynamics and gradient dynamics for online nonlinear equations solving. In Proceedings of the 2009 IEEE International Conference on Automation and Logistics, Shenyang, China, 5–7 August 2009; pp. 566–571.
  4. Ben-Israel, A.; Greville, T.N.E. Generalized Inverses: Theory and Applications, 2nd ed.; CMS Books in Mathematics; Springer: New York, NY, USA, 2003.
  5. Wang, G.; Wei, Y.; Qiao, S. Generalized Inverses: Theory and Computations; Science Press, Springer: Beijing, China, 2018.
  6. Dash, P.; Zohora, F.T.; Rahaman, M.; Hasan, M.M.; Arifuzzaman, M. Usage of Mathematics Tools with Example in Electrical and Electronic Engineering. Am. Sci. Res. J. Eng. Technol. Sci. (ASRJETS) 2018, 46, 178–188.
  7. Qin, F.; Lee, J. Dynamic methods for missing value estimation for DNA sequences. In Proceedings of the 2010 International Conference on Computational and Information Sciences, IEEE, Chengdu, China, 9–11 July 2010; pp. 442–445.
  8. Soleimani, F.; Stanimirović, P.S.; Soleimani, F. Some matrix iterations for computing generalized inverses and balancing chemical equations. Algorithms 2015, 8, 982–998.
  9. Udawat, B.; Begani, J.; Mansinghka, M.; Bhatia, N.; Sharma, H.; Hadap, A. Gauss Jordan method for balancing chemical equation for different materials. Mater. Today Proc. 2022, 51, 451–454.
  10. Doty, K.L.; Melchiorri, C.; Bonivento, C. A theory of generalized inverses applied to robotics. Int. J. Robot. Res. 1993, 12, 1–19.
  11. Li, L.; Hu, J. An efficient second-order neural network model for computing the Moore–Penrose inverse of matrices. IET Signal Process. 2022, 16, 1106–1117.
  12. Wang, X.; Tang, B.; Gao, X.G.; Wu, W.H. Finite iterative algorithms for the generalized reflexive and anti-reflexive solutions of the linear matrix equation AXB = C. Filomat 2017, 31, 2151–2162.
  13. Ding, F.; Chen, T. Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Autom. Control 2005, 50, 1216–1221.
  14. Ding, F.; Zhang, H. Gradient-based iterative algorithm for a class of the coupled matrix equations related to control systems. IET Control Theory Appl. 2014, 8, 1588–1595.
  15. Zhang, H. Quasi gradient-based inversion-free iterative algorithm for solving a class of the nonlinear matrix equations. Comput. Math. Appl. 2019, 77, 1233–1244.
  16. Wang, J. Recurrent neural networks for computing pseudoinverses of rank-deficient matrices. SIAM J. Sci. Comput. 1997, 18, 1479–1493.
  17. Fa-Long, L.; Zheng, B. Neural network approach to computing matrix inversion. Appl. Math. Comput. 1992, 47, 109–120.
  18. Wang, J. A recurrent neural network for real-time matrix inversion. Appl. Math. Comput. 1993, 55, 89–100.
  19. Wang, J. Recurrent neural networks for solving linear matrix equations. Comput. Math. Appl. 1993, 26, 23–34.
  20. Wei, Y. Recurrent neural networks for computing weighted Moore–Penrose inverse. Appl. Math. Comput. 2000, 116, 279–287.
  21. Xiao, L.; Zhang, Y.; Li, K.; Liao, B.; Tan, Z. A novel recurrent neural network and its finite-time solution to time-varying complex matrix inversion. Neurocomputing 2019, 331, 483–492.
  22. Yi, C.; Chen, Y.; Lu, Z. Improved gradient-based neural networks for online solution of Lyapunov matrix equation. Inf. Process. Lett. 2011, 111, 780–786.
  23. Yi, C.; Qiao, D. Improved neural solution for the Lyapunov matrix equation based on gradient search. Inf. Process. Lett. 2013, 113, 876–881.
  24. Xiao, L.; Li, K.; Tan, Z.; Zhang, Z.; Liao, B.; Chen, K.; Jin, L.; Li, S. Nonlinear gradient neural network for solving system of linear equations. Inf. Process. Lett. 2019, 142, 35–40.
  25. Xiao, L. A finite-time convergent neural dynamics for online solution of time-varying linear complex matrix equation. Neurocomputing 2015, 167, 254–259.
  26. Lv, X.; Xiao, L.; Tan, Z.; Yang, Z.; Yuan, J. Improved Gradient Neural Networks for solving Moore–Penrose Inverse of full-rank matrix. Neural Process. Lett. 2019, 50, 1993–2005.
  27. Wang, J. Electronic realisation of recurrent neural network for solving simultaneous linear equations. Electron. Lett. 1992, 28, 493–495.
  28. Zhang, Y.; Chen, K.; Tan, H.Z. Performance analysis of gradient neural network exploited for online time-varying matrix inversion. IEEE Trans. Autom. Control. 2009, 54, 1940–1945.
  29. Wang, J.; Li, H. Solving simultaneous linear equations using recurrent neural networks. Inf. Sci. 1994, 76, 255–277.
  30. Tan, Z.; Chen, H. Nonlinear function activated GNN versus ZNN for online solution of general linear matrix equations. J. Frankl. Inst. 2023, 360, 7021–7036.
  31. Tan, Z.; Hu, Y.; Chen, K. On the investigation of activation functions in gradient neural network for online solving linear matrix equation. Neurocomputing 2020, 413, 185–192.
  32. Tan, Z. Fixed-time convergent gradient neural network for solving online sylvester equation. Mathematics 2022, 10, 3090.
  33. Wang, D.; Liu, X.-W. A gradient-type noise-tolerant finite-time neural network for convex optimization. Neurocomputing 2022, 49, 647–656.
  34. Stanimirović, P.S.; Petković, M.D. Gradient neural dynamics for solving matrix equations and their applications. Neurocomputing 2018, 306, 200–212.
  35. Stanimirović, P.S.; Katsikis, V.N.; Li, S. Hybrid GNN-ZNN models for solving linear matrix equations. Neurocomputing 2018, 316, 124–134.
  36. Sowmya, G.; Thangavel, P.; Shankar, V. A novel hybrid Zhang neural network model for time-varying matrix inversion. Eng. Sci. Technol. Int. J. 2022, 26, 101009.
  37. Wu, W.; Zheng, B. Improved recurrent neural networks for solving Moore–Penrose inverse of real-time full-rank matrix. Neurocomputing 2020, 418, 221–231.
  38. Zhang, Y.; Wang, C. Gradient-Zhang neural network solving linear time-varying equations. In Proceedings of the 2022 IEEE 17th Conference on Industrial Electronics and Applications (ICIEA), Chengdu, China, 16–19 December 2022; pp. 396–403.
  39. Wang, C.; Zhang, Y. Theoretical Analysis of Gradient-Zhang Neural Network for Time-Varying Equations and Improved Method for Linear Equations. In Neural Information Processing; ICONIP 2023, Lecture Notes in Computer Science; Luo, B., Cheng, L., Wu, Z.G., Li, H., Li, C., Eds.; Springer: Singapore, 2024; Volume 14447.
  40. Stanimirović, P.S.; Mourtas, S.D.; Katsikis, V.N.; Kazakovtsev, L.A.; Krutikov, V.N. Recurrent neural network models based on optimization methods. Mathematics 2022, 10, 4292.
  41. Nocedal, J.; Wright, S. Numerical Optimization; Springer: New York, NY, USA, 1999.
  42. Stanimirović, P.S.; Petković, M.D.; Gerontitis, D. Gradient neural network with nonlinear activation for computing inner inverses and the Drazin inverse. Neural Process. Lett. 2017, 48, 109–133.
  43. Smoktunowicz, A.; Smoktunowicz, A. Set-theoretic solutions of the Yang–Baxter equation and new classes of R-matrices. Linear Algebra Its Appl. 2018, 546, 86–114.
  44. Baksalary, O.M.; Trenkler, G. On matrices whose Moore–Penrose inverse is idempotent. Linear Multilinear Algebra 2022, 70, 2014–2026.
  45. Wang, X.Z.; Ma, H.; Stanimirović, P.S. Nonlinearly activated recurrent neural network for computing the Drazin inverse. Neural Process. Lett. 2017, 46, 195–217.
  46. Wang, S.D.; Kuo, T.S.; Hsu, C.F. Trace bounds on the solution of the algebraic matrix Riccati and Lyapunov equation. IEEE Trans. Autom. Control 1986, 31, 654–656.
Figure 1. Simulink implementation of GGNN ( A , B , D ) evolution (10).
Figure 1. Simulink implementation of GGNN ( A , B , D ) evolution (10).
Axioms 13 00049 g001
Figure 2. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. E G ( t ) F in GGNN ( ( A T A ) 2 , I , A T A A T ) compared to GNN ( A T A , I 4 , A T ) in Example 1.
Figure 2. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. E G ( t ) F in GGNN ( ( A T A ) 2 , I , A T A A T ) compared to GNN ( A T A , I 4 , A T ) in Example 1.
Axioms 13 00049 g002
Figure 3. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. V ( t ) V * F in GGNN ( ( A T A ) 2 , I , A T A A T ) compared to GNN ( A T A , I 4 , A T ) in Example 1.
Figure 3. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. V ( t ) V * F in GGNN ( ( A T A ) 2 , I , A T A A T ) compared to GNN ( A T A , I 4 , A T ) in Example 1.
Axioms 13 00049 g003
Figure 4. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. Elementwise convergence trajectories v i j V ( t ) of the GGNN ( A , B , D ) network in Example 2.
Figure 4. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power–sigmoid activation. Elementwise convergence trajectories v i j V ( t ) of the GGNN ( A , B , D ) network in Example 2.
Axioms 13 00049 g004
Figure 5. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E_G(t)\|_F$ in GGNN$(A, B, D)$ compared to GNN$(A, B, D)$ in Example 2.
Figure 6. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E(t)\|_F$ in GGNN$(A, B, D)$ compared to GNN$(A, B, D)$ in Example 2.
Figure 7. (a) $\gamma = 10$, $t \in [0, 10^{-2}]$. (b) $\gamma = 10^{3}$, $t \in [0, 10^{-3}]$. (c) $\gamma = 10^{6}$, $t \in [0, 10^{-6}]$. $\|E(t)\|_F$ for different $\gamma$ in GGNN$(I, AA^T, A^T)$ compared to GNN$(I, A, I)$ in Example 3.
Figure 8. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E_G(t)\|_F$ in GGNN$(I, AA^T, A^T)$ compared to GNN$(I, A, I)$ in Example 3.
Figure 9. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E(t)\|_F$ in GGNN$(A_1^T A_1, I_3, A_1^T D)$ compared to GNN$(A_1, I_3, D)$ in Example 4.
Figure 10. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E_G(t)\|_F$ in GGNN$(A_1^T A_1, I_3, A_1^T D)$ compared to GNN$(A_1, I_3, D)$ in Example 4.
Figure 11. Simulink implementation of (25).
Figure 12. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E_{A,I,B}\|_F$ of HGZNN$(A, I, I)$ compared to GGNN$(A, I, I)$ and GZNN$(A, I, I)$ in Example 6.
Figure 13. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|A^{-1}B - V(t)\|_F$ of HGZNN$(A, I, I)$ compared to GGNN$(A, I, I)$ and GZNN$(A, I, I)$ in Example 6.
Figure 14. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. Elementwise convergence trajectories of the HGZNN$(A, I, B)$ network in Example 7.
Figure 15. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. $\|E_{A,I,B}\|_F$ of HGZNN$(A, I, B)$ compared to GGNN$(A, I, B)$ and GZNN$(A, I, B)$ in Example 7.
Figure 16. (a) Linear activation. (b) Power-sigmoid activation. (c) Smooth power-sigmoid activation. Frobenius norm of the error matrix $A^{-1}B - X(t)$ of HGZNN$(A, I, B)$ compared to GGNN$(A, I, B)$ and GZNN$(A, I, B)$ in Example 7.
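Table 1. Input data. Columns are grouped as Matrix A ($m$, $n$, rank($A$)), Matrix B ($p$, $q$, rank($B$)), Matrix D ($m$, $q$, rank($D$)), and Input and Residual Norm ($\gamma$, $t_f$, $\lVert AA^{\dagger}DB^{\dagger}B - D\rVert_F$).

| $m$ | $n$ | rank($A$) | $p$ | $q$ | rank($B$) | $m$ | $q$ | rank($D$) | $\gamma$ | $t_f$ | $\lVert AA^{\dagger}DB^{\dagger}B - D\rVert_F$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 8 | 8 | 9 | 7 | 7 | 10 | 7 | 7 | $10^{4}$ | 0.5 | 1.051 |
| 10 | 8 | 6 | 9 | 7 | 7 | 10 | 7 | 7 | $10^{4}$ | 0.5 | 1.318 |
| 10 | 8 | 6 | 9 | 7 | 5 | 10 | 7 | 7 | $10^{4}$ | 0.5 | 1.81 |
| 10 | 8 | 6 | 9 | 7 | 5 | 10 | 7 | 5 | $10^{4}$ | 5 | 2.048 |
| 10 | 8 | 1 | 9 | 7 | 2 | 10 | 7 | 1 | $10^{4}$ | 5 | 2.372 |
| 20 | 10 | 10 | 8 | 5 | 5 | 20 | 5 | 5 | $10^{6}$ | 5 | 1.984 |
| 20 | 10 | 5 | 8 | 5 | 5 | 20 | 5 | 5 | $10^{6}$ | 5 | 2.455 |
| 20 | 10 | 5 | 8 | 5 | 2 | 20 | 5 | 5 | $10^{6}$ | 1 | 3.769 |
| 20 | 10 | 2 | 8 | 5 | 2 | 20 | 5 | 2 | $10^{6}$ | 1 | 2.71 |
| 20 | 15 | 15 | 5 | 2 | 2 | 20 | 2 | 2 | $10^{8}$ | 1 | 1.1 |
| 20 | 15 | 10 | 5 | 2 | 2 | 20 | 2 | 2 | $10^{8}$ | 1 | 1.158 |
| 20 | 15 | 10 | 5 | 2 | 1 | 20 | 2 | 2 | $10^{8}$ | 1 | 2.211 |
| 20 | 15 | 5 | 5 | 2 | 1 | 20 | 2 | 2 | $10^{8}$ | 1 | 1.726 |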
Table 2. Experimental results based on data presented in Table 1.

| $\lVert E(t)\rVert_F$ (GNN) | $\lVert E(t)\rVert_F$ (GGNN) | $\lVert E_G(t)\rVert_F$ (GNN) | $\lVert E_G(t)\rVert_F$ (GGNN) | CPU (GNN) | CPU (GGNN) |
|---|---|---|---|---|---|
| 1.051 | 1.094 | $2.52 \times 10^{-9}$ | 0.02524 | 5.017148 | 13.470995 |
| 1.318 | 1.393 | $3.122 \times 10^{-7}$ | 0.03661 | 22.753954 | 10.734163 |
| 1.811 | 1.899 | 0.0008711 | 0.03947 | 15.754537 | 15.547785 |
| 2.048 | 2.082 | $1.96 \times 10^{-10}$ | 0.00964 | 9.435709 | 17.137916 |
| 2.372 | 2.372 | 2 $1.7422 \times 10^{-15}$ | $2.003 \times 10^{-15}$ | 21.645386 | 13.255210 |
| 1.984 | 1.984 | $2.288 \times 10^{-14}$ | $9.978 \times 10^{-15}$ | 21.645386 | 13.255210 |
| 2.455 | 2.455 | $1.657 \times 10^{-11}$ | $1.693 \times 10^{-14}$ | 50.846893 | 19.059385 |
| 3.769 | 3.769 | $6.991 \times 10^{-11}$ | $4.071 \times 10^{-14}$ | 42.184748 | 13.722390 |
| 2.71 | 2.71 | $1.429 \times 10^{-14}$ | $1.176 \times 10^{-14}$ | 148.484258 | 13.527065 |
| 1.1 | 1.1 | $1.766 \times 10^{-13}$ | $5.949 \times 10^{-15}$ | 218.169376 | 17.5666568 |
| 1.158 | 1.158 | $2.747 \times 10^{-10}$ | $2.981 \times 10^{-13}$ | 45.505618 | 12.441782 |
| 2.211 | 2.211 | $7.942 \times 10^{-12}$ | $8.963 \times 10^{-14}$ | 194.605133 | 14.117241 |
| 1.726 | 1.726 | $8.042 \times 10^{-15}$ | $3.207 \times 10^{-15}$ | 22.340501 | 11.650829 |
