This section first elaborates on the multi-mode EAF harmonic model and then proposes the DCMM.
2.1. Multi-Mode EAF Harmonic Model
The non-linear load model shown as Equation (1) depends on the fundamental and harmonic voltages [24,25,26,27],
where $I_h$ is the $h$-th harmonic current, $U_1$ denotes the fundamental voltage, $U_m$ is the $m$-th harmonic voltage, and $C$ is the constant coefficient.
The above-mentioned non-linear load model is not suitable for the EAF, because the harmonic currents correlate more strongly with the fundamental current than with the fundamental voltage. Therefore, a non-linear coupling model based on the interactions among the fundamental current, the harmonic voltages, and the harmonic currents is presented as follows,
where $I_1$ denotes the fundamental current.
The non-linear function $F_h$ makes the harmonic power-flow calculation computationally expensive, because the system equations and the non-linear coupling model must be solved iteratively [20,21]. Therefore, the non-linear coupling model is approximated by a linear coupling model, as shown in Equation (3). In this way, the computational burden is reduced while the convergence precision is maintained.
$$I_h = A\,[I_1, U_2, U_3, \ldots, U_m]^{T} + C \tag{3}$$
where $A$ is the coupling matrix.
Denoting $[I_1, U_2, U_3, \ldots, U_m]^{T}$ as $W_m$, Equation (3) can be rewritten as follows,
$$I_h = A W_m + C \tag{4}$$
The coupling matrix $A$ and the constant coefficient $C$ represent the coupling relationship between the fundamental component and the harmonic components. Note that the harmonic components vary with the modes of the EAF smelting process. Therefore, it is necessary to classify the measured data according to operating modes and to estimate $A$ and $C$ from the respective data clusters. It is assumed that an EAF smelting process has $n$ operating modes, i.e., the multi-mode EAF harmonic model consists of $n$ linear coupling models of the form of Equation (4). The next goal of this paper is to identify the model parameters $A^{(1)}$, $C^{(1)}$, $A^{(2)}$, $C^{(2)}$, …, $A^{(n)}$, and $C^{(n)}$.
However, the dimension of the concerned harmonics in Equation (4) is normally 25, and it is difficult to classify such high-dimensional data and to identify the model parameters accurately. Therefore, PCA was adopted to reduce the dimension of the dataset so that accurate model parameters could be obtained. PCA rotates the coordinate axes of the dataset so that the variances along the new axes are in descending order. Only the first several axes were used to classify the measured EAF data, because axes with sufficiently small variances cannot distinguish the data. That is, PCA extracts a smaller representative dataset from a larger one [28]. Based on this description, the process of model simplification is elaborated as follows.
Suppose that the dimension $m$ of $W_m$ is reduced to $j$ by PCA. Then $W_m$ is transformed into $R_j$ as follows,
$$R_j = Q W_m \tag{5}$$
where $Q$ denotes a matrix consisting of the selected eigenvectors of the covariance matrix of $W_m$.
$R_j$ can be represented as $[r_1, r_2, r_3, \ldots, r_j]^{T}$, where $r_1$, $r_2$, $r_3$, …, and $r_j$ are the first $j$ principal components of $W_m$.
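The projection step can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this work: the data are synthetic, the function name `pca_matrix` is hypothetical, the rows of $Q$ are taken to be the $j$ leading eigenvectors of the sample covariance matrix, and each row of the array `W` holds one measurement of $W_m$, so that the product $Q W_m$ is computed row-wise as `W @ Q.T`.

```python
import numpy as np

def pca_matrix(W, j):
    """Return (Q, eigvals): the j leading eigenvectors of cov(W) as rows of Q.

    W has shape (n_samples, m); each row is one measurement of W_m.
    """
    cov = np.cov(W, rowvar=False)               # (m, m) sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:j]       # indices of the j largest
    return eigvecs[:, order].T, eigvals[order]  # Q has shape (j, m)

# Synthetic stand-in for measured data: 5 correlated channels driven by
# 2 latent factors plus a little noise (assumed values, not EAF data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
W = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

Q, top_eigvals = pca_matrix(W, j=2)
R = W @ Q.T                                     # row-wise form of R_j = Q W_m
variances = R.var(axis=0, ddof=1)               # variance along each new axis
```

The variances along the retained axes equal the leading eigenvalues and appear in descending order, which is the property used above to justify discarding the low-variance axes.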
Consequently, the simplified model based on PCA is formulated as follows,
$$I_h = A Q^{-1} R_j + C \tag{6}$$
Denoting $A Q^{-1}$ as $A_p$, Equation (6) is rewritten as follows,
$$I_h = A_p R_j + C \tag{7}$$
where $A_p$ is the simplified coupling matrix.
The linear coupling relationship among different frequencies of harmonics is consistent before and after model simplification [29]. Additionally, the modeling precision and computational efficiency are improved since model simplification filters out redundant and correlated features of the measured data of EAF [30,31,32].
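Thus, the next goal of this paper becomes identifying the simplified model parameters $A_p^{(1)}$, $C^{(1)}$, $A_p^{(2)}$, $C^{(2)}$, …, $A_p^{(n)}$, and $C^{(n)}$. The model parameters $A^{(1)}$, $C^{(1)}$, $A^{(2)}$, $C^{(2)}$, …, $A^{(n)}$, and $C^{(n)}$ can then be calculated according to Equation (8),
$$A^{(i)} = A_p^{(i)} Q, \quad i = 1, 2, \ldots, n \tag{8}$$
where $Q$ is the known matrix obtained from the PCA.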
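The mapping from the simplified parameters back to the full ones can be checked numerically. The sketch below is illustrative only and rests on assumptions not stated in the text: $Q$ is a $j \times m$ matrix with orthonormal rows, $Q^{-1}$ is read as its pseudo-inverse $Q^{T}$, $I_h$ is treated as a real scalar, and all numerical values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
m, j = 6, 3

# Assumed PCA matrix: j orthonormal rows of length m (from a QR factorisation).
Q = np.linalg.qr(rng.normal(size=(m, j)))[0].T   # shape (j, m)

A_p = rng.normal(size=(1, j))                    # simplified coupling matrix
C = 0.2                                          # constant coefficient
A = A_p @ Q                                      # full coupling matrix, A = A_p Q

w = rng.normal(size=m)                           # one sample of W_m
r = Q @ w                                        # its reduced form R_j = Q W_m

i_simplified = (A_p @ r + C).item()              # I_h from the simplified model
i_full = (A @ w + C).item()                      # I_h from the full model
```

Both models return the same $I_h$ for the same sample, and `A @ Q.T` reproduces `A_p`, so under these assumptions no information about the coupling is lost when moving between the two parameter sets.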
Based on the discussion above, the parameter identification method of the multi-mode EAF harmonic model is elaborated in Section 2.2.
2.2. Data-Driven Compartmental Modeling Method (DCMM)
In this subsection, the DCMM based on the multi-mode EAF harmonic model is proposed; it is illustrated in Figure 1. First, the $h$-th harmonic current data, the fundamental current data, and the 2nd to $m$-th harmonic voltage data were normalized according to the short-circuit capacity and base voltage, and the dimension of the normalized dataset was reduced by PCA. Second, the optimal number of clusters and the initial clustering centers were calculated based on the sum of the squared error (SSE) and PSO, respectively. Then, the preprocessed data were separated into several clusters by the K-means algorithm. Thereafter, the simplified model parameters $A_p^{(1)}$, $C^{(1)}$, $A_p^{(2)}$, $C^{(2)}$, …, $A_p^{(n)}$, and $C^{(n)}$ were identified. Finally, the model parameters $A^{(1)}$, $C^{(1)}$, $A^{(2)}$, $C^{(2)}$, …, $A^{(n)}$, and $C^{(n)}$ were calculated.
In our method, the PSO-based K-means algorithm was adopted to divide the multi-mode EAF harmonic dataset into several clusters corresponding to the different modes of the EAF smelting process. Unlike the conventional K-means algorithm, we used the SSE and PSO to determine the optimal number of clusters and the initial clustering centers, respectively, in order to improve clustering accuracy. Moreover, the clustering center and distance measure of the PSO-based K-means algorithm were redefined to ensure that the volt-ampere characteristic of each mode of the EAF smelting process is linear.
Considering that the number of clusters of the K-means algorithm needs to be set beforehand, we introduced the SSE, formulated as follows, to determine the optimal number of clusters,
$$SSE = \sum_{i=1}^{k} \sum_{q \in Y_i} \left| q - v_i \right|^{2}$$
where $k$ is the number of clusters, $Y_i$ is the $i$-th cluster, $q$ denotes a data point belonging to $Y_i$, and $v_i$ is the center of $Y_i$.
As the number of clusters increases, the SSE decreases, and it decreases markedly when $k$ approaches the optimal number of clusters. Therefore, we calculated the SSE for $k = 1, 2, \ldots, 10$ and took the $k$ at the marked decline of the SSE as the optimal number of clusters.
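The SSE scan over $k$ can be sketched as follows. This is a minimal illustration, not the procedure used in this work: it uses conventional K-means with point centers and random initialization instead of the redefined centers and PSO initialization described in this subsection, the data are three synthetic blobs, and the helper names `kmeans` and `sse` are hypothetical.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=100):
    """Conventional K-means with point centers (Lloyd's algorithm)."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for i in range(k):
            if np.any(labels == i):        # keep the old center if a cluster empties
                new_centers[i] = X[labels == i].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dist.argmin(axis=1)

def sse(X, centers, labels):
    """Sum of squared distances between each point and its cluster center."""
    return float(np.sum((X - centers[labels]) ** 2))

# Synthetic data: three well-separated groups (assumed values, not EAF data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(60, 2))
               for c in ([0, 0], [4, 0], [0, 4])])

sse_values = []
for k in range(1, 11):
    # Best of a few restarts, to reduce sensitivity to random initialization.
    runs = [kmeans(X, k, rng) for _ in range(5)]
    sse_values.append(min(sse(X, c, l) for c, l in runs))

for k, value in zip(range(1, 11), sse_values):
    print(k, round(value, 2))
```

The printed curve is meant to be inspected for the point where the decline becomes marked; the sketch deliberately leaves that choice to the reader, because automatic elbow rules can disagree on the same curve.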
PSO was employed to obtain the initial clustering centers of the K-means algorithm instead of initializing them randomly [33]. Compared with randomly initialized K-means, the PSO-based K-means algorithm searches more efficiently for a global or near-global optimum, which enhances the clustering accuracy and computational efficiency [34,35].
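One way to realize PSO-based initialization is sketched below. It is an assumption-laden illustration, not the configuration used in this work: each particle encodes $k$ conventional point centers, the fitness is the SSE under nearest-center assignment, and the swarm size, iteration count, inertia weight, and acceleration coefficients are arbitrary choices. The function names are hypothetical.

```python
import numpy as np

def assignment_sse(X, centers):
    """SSE when every point is assigned to its nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())

def pso_initial_centers(X, k, rng, n_particles=20, n_iter=60,
                        inertia=0.7, c1=1.5, c2=1.5):
    """Search for k initial centers with a basic global-best PSO."""
    n = X.shape[0]
    # Each particle holds k candidate centers, started on random data points.
    pos = X[rng.integers(0, n, size=(n_particles, k))]    # (particles, k, d)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([assignment_sse(X, p) for p in pos])
    g = pbest_fit.argmin()
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    history = [gbest_fit]
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([assignment_sse(X, p) for p in pos])
        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmin()
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
        history.append(gbest_fit)
    return gbest, history

# Synthetic data: three well-separated groups (assumed values, not EAF data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(60, 2))
               for c in ([0, 0], [4, 0], [0, 4])])
centers, history = pso_initial_centers(X, k=3, rng=rng)
```

The returned `centers` would then seed the K-means iterations. The `history` list records the best fitness after each iteration and never increases, since the global best is only replaced by a strictly better candidate.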
The clustering center and distance measure of the PSO-based K-means algorithm were redefined as follows,
where $a_1$, $a_2$, …, $a_j$, and $c$ are constant coefficients, $j$ denotes the number of dimensions of $R_j$, $dis(i)$ is the distance between the $i$-th data point and the clustering center, and $r_1(i)$, $r_2(i)$, …, $r_j(i)$, and $I_h(i)$ are the coordinates of the $i$-th data point.
The redefined clustering center is a linear relation rather than a single point, which differs from the clustering center of the conventional K-means algorithm. Such a redefinition guarantees that the volt-ampere characteristic of each mode of the EAF smelting process is linear.
The clustering process is illustrated in Figure 2. First, the distance between each data point and each clustering center is calculated. Second, each data point is assigned to the nearest cluster. Third, the clustering centers are recalculated. Finally, if the clustering centers have changed, the procedure returns to the first step; otherwise, the current clustering centers are taken as the result.
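The iteration above, with linear clustering centers, can be sketched as follows. The sketch makes several assumptions that are not taken from this work: each center is the hyperplane $I_h = a_1 r_1 + \ldots + a_j r_j + c$, the distance of a point to a center is the absolute residual $|a_1 r_1(i) + \ldots + a_j r_j(i) + c - I_h(i)|$, centers are recalculated by least squares, and the iteration stops when the assignment no longer changes. The function names and the synthetic two-mode data are hypothetical.

```python
import numpy as np

def fit_plane(R, ih):
    """Least-squares fit of ih = a . r + c; returns [a_1, ..., a_j, c]."""
    design = np.hstack([R, np.ones((len(R), 1))])
    coef, *_ = np.linalg.lstsq(design, ih, rcond=None)
    return coef

def plane_distance(R, ih, coef):
    """Assumed distance: absolute residual of each point to the plane."""
    return np.abs(R @ coef[:-1] + coef[-1] - ih)

def cluster_linear_centers(R, ih, init_coefs, max_iter=100):
    coefs = np.array(init_coefs, dtype=float)
    labels = None
    for _ in range(max_iter):
        # Steps 1-2: distance to every center, then nearest-center assignment.
        dist = np.stack([plane_distance(R, ih, c) for c in coefs], axis=1)
        new_labels = dist.argmin(axis=1)
        # Step 4: stop once the assignment (and hence the centers) is stable.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recalculate each linear center from its own points.
        for i in range(len(coefs)):
            mask = labels == i
            if mask.sum() > R.shape[1]:      # enough points for a unique fit
                coefs[i] = fit_plane(R[mask], ih[mask])
    return coefs, labels

# Two synthetic "modes" with different linear characteristics (assumed values).
rng = np.random.default_rng(2)
r = rng.uniform(1.0, 2.0, size=200)
true_labels = np.repeat([0, 1], 100)
ih = np.where(true_labels == 0, 2.0 * r + 0.1, -1.0 * r + 0.5)
ih = ih + 0.01 * rng.normal(size=200)
R = r[:, None]                               # shape (200, 1), i.e. j = 1

coefs, labels = cluster_linear_centers(R, ih, init_coefs=[[1.5, 0.0], [-0.5, 0.3]])
```

In this example the two recovered centers are close to the generating lines and every point is assigned to the mode it was drawn from. As with point centers, the outcome depends on the initial centers, which is the motivation for a careful initialization.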
Based on the discussion above, our method is able to partition the multi-mode EAF harmonic dataset into several clusters, such that the objects with shared linear volt-ampere characteristics are compact, and the objects with different linear volt-ampere characteristics are well separated.
Finally, the simplified model parameters $A_p^{(1)}$, $C^{(1)}$, $A_p^{(2)}$, $C^{(2)}$, …, $A_p^{(n)}$, and $C^{(n)}$ were identified by least-squares fitting, and the model parameters $A^{(1)}$, $C^{(1)}$, $A^{(2)}$, $C^{(2)}$, …, $A^{(n)}$, and $C^{(n)}$ were calculated according to Equation (8). In this way, the multi-mode EAF harmonic model was established by the proposed DCMM.
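The final identification step can be sketched as a per-cluster least-squares fit followed by the mapping back to the full parameters. The sketch is illustrative and assumes that cluster labels are already available, that $I_h$ is a real scalar, and that $Q$ has orthonormal rows; the function name `identify_modes` and the noise-free synthetic data are hypothetical.

```python
import numpy as np

def identify_modes(R, ih, labels, Q):
    """Fit I_h = A_p R_j + C in each cluster, then map A_p to A = A_p Q."""
    params = {}
    for mode in np.unique(labels):
        mask = labels == mode
        design = np.hstack([R[mask], np.ones((int(mask.sum()), 1))])
        coef, *_ = np.linalg.lstsq(design, ih[mask], rcond=None)
        A_p, C = coef[:-1], float(coef[-1])
        params[int(mode)] = {"A_p": A_p, "C": C, "A": A_p @ Q}
    return params

rng = np.random.default_rng(3)
m, j = 4, 2
Q = np.linalg.qr(rng.normal(size=(m, j)))[0].T   # assumed PCA matrix, shape (j, m)

# Ground-truth simplified parameters for two modes (arbitrary values).
true = {0: (np.array([2.0, -1.0]), 0.1), 1: (np.array([-0.5, 3.0]), 0.4)}
labels = np.repeat([0, 1], 50)
R = rng.normal(size=(100, j))
ih = np.array([R[i] @ true[int(l)][0] + true[int(l)][1]
               for i, l in enumerate(labels)])

params = identify_modes(R, ih, labels, Q)
```

With noise-free data the fit recovers the generating coefficients of each mode to numerical precision. With measured data the residual of each fit gives a direct indication of how well a single linear model describes that cluster.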