1. Introduction
Dynamic security assessment (DSA) evaluates the security and stability of the power system under specified operating conditions (OCs) and anticipated faults [1]. It provides valuable information for preventive and restoration control decision-making [2,3]. With the integration of renewable energy and the increasing frequency of extreme weather events, the uncertainty of the OCs has increased significantly. Preventive and restoration control decisions must therefore be made online to adapt to fast-varying OCs, which requires a rapid DSA. However, the traditional DSA method based on time-domain simulation is computationally expensive [4]. It is not efficient enough to assess the massive number of candidate schemes generated during the control decision-making process.
The data-driven DSA method utilizes machine learning to directly learn the map between the input and the dynamic security of power systems [5,6]. It can realize an online assessment due to its high efficiency. The input features are fundamental to data-driven DSA. For a specific fault type, the dynamic security of the power system is closely related to the pre-fault OCs and fault locations. Therefore, the DSA inputs need to incorporate both OCs and the fault location information. To ensure high evaluation efficiency, the DSA typically employs steady-state features as inputs [7,8,9]. Pre-fault OCs can be represented by steady-state power flow features, such as the node injection power, voltage, and line transmission power [10,11]. For fault locations, the traditional DSA method constructs a dedicated assessment model for each fault, which yields high assessment accuracy [12,13]. However, an increase in fault locations leads to a large set of DSA models and high maintenance costs.
Existing DSA methods with a unified assessment model adapted to fault locations can be divided into three categories. The first category leverages machine learning techniques to handle multiple fault locations. Assuming that each OC corresponds to a single anticipated fault, Refs. [14,15] employ transfer learning to build DSA models for multiple faults, with an accuracy above 97%. However, they do not include fault location information in the inputs and cannot assess different fault locations under the same OCs. Ref. [16] employs multi-label learning to construct a single DSA model for multiple faults and improves the DSA accuracy to 99%. The number of considered fault locations is determined by the training sample labels. Therefore, the DSA model cannot generalize to unseen fault locations. The second category employs discrete encoding methods such as binary [17], integer [18,19,20,21], and one-hot encoding [22,23] to represent fault location features. DSA models distinguish different fault locations through these discrete encodings and are generally able to achieve an assessment accuracy of more than 97%. However, discrete encoding fails to reflect the physical differences between fault locations, so the discrete encoding-based DSA model generalizes poorly to unseen fault locations. The third category constructs electrical coordinates to represent fault locations [24,25]. The electrical coordinate is composed of the electrical distances to preselected reference nodes. It can reflect the electrical differences among fault locations. The electrical coordinate-based DSA model has high generalization performance and obtains an assessment accuracy of 98%.
The mapping relationship among OCs, fault locations, and the system dynamic security is strongly nonlinear. To ensure high assessment accuracy, the training set needs to cover a wide variation range of OCs and anticipated fault locations, so the training set is large. Learning a nonlinear map from massive samples requires a machine learning model with strong nonlinear fitting capabilities. Typically, nonlinear fitting capabilities are enhanced by increasing the structural complexity of the machine learning model, e.g., long short-term memory (LSTM) neural networks [26], convolutional neural networks (CNNs) [27], and recurrent neural networks (RNNs) [28]. However, these highly complex models face issues such as high training costs, overfitting risks, poor interpretability, and difficulties in model updating. They are hard to update in time to support online preventive and restoration control decision-making.
According to the principle of Occam’s Razor in machine learning, the models with simple structures are preferentially adopted within acceptable training errors. Therefore, this paper proposes a dynamic security partition assessment method of the power system. Instead of a single high-complexity unified assessment model, multiple low-complexity partition assessment models are employed without an accuracy loss. The aim is to achieve an accurate and incrementally updated DSA using low-complexity models. The contributions of this paper are as follows:
- (1) A self-adaptive power grid partition method that strictly accounts for electrical connections is proposed based on the mean shift algorithm. The node locations are represented by a symmetric electrical distance matrix, and the Chebyshev distance is introduced to measure the differences between node location features.
- (2) To build the input feature set for DSA, high-level steady-state features of OCs are extracted based on a stacked denoising autoencoder (SDAE). Meanwhile, the symmetric electrical distance matrix is modified to represent fault locations in local regions, which reduces the redundancy of fault location features.
- (3) Multiple DSA models are constructed for fault locations in each local region. Each model is a radial basis function neural network (RBFNN) integrated with the Chebyshev distance, which achieves an assessment accuracy close to that of complex models. Moreover, an incremental updating strategy is designed to enhance the adaptability of the DSA model.
The rest of this paper is organized as follows: Section 2 introduces a self-adaptive power grid partition method that strictly accounts for electrical connections. The input feature set for DSA is constructed in Section 3. Section 4 constructs DSA models for fault locations in each local region and designs an incremental updating strategy. Test results for a simplified provincial system and a larger-scale practical system in China are presented in Section 5. Conclusions are drawn in Section 6.
2. Power Grid Partition with a Strict Consideration of Electrical Connections
2.1. Description of the Power Grid Partition Problem for DSA
The impacts of different fault locations on the power system dynamic security have local similarities. That is, fault locations with tight electrical connections have similar effects on the dynamic security and vice versa. The tightness of electrical connections is typically quantified as the impedance-based electrical distance [29,30]. Therefore, nodes with small electrical distances should be placed within the same region, while nodes with large electrical distances should be placed in separate regions. Constructing dedicated models for each region can reduce the difficulty of training DSA models.
The impedance-based electrical distance between nodes in a power grid is typically defined as the equivalent impedance [30]:
$$d_{ij} = Z_{ii} + Z_{jj} - 2Z_{ij} \tag{1}$$
where $d_{ij}$ represents the electrical distance between nodes $i$ and $j$; $Z_{ii}$ and $Z_{jj}$ denote the self-impedances of nodes $i$ and $j$, respectively; and $Z_{ij}$ denotes the mutual impedance between nodes $i$ and $j$.
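As a minimal numerical sketch, the electrical distance can be computed from a node impedance matrix, assuming Equation (1) takes the standard equivalent-impedance form $d_{ij} = Z_{ii} + Z_{jj} - 2Z_{ij}$. The three-node resistive network, its shunt, and the function name `electrical_distance_matrix` are hypothetical and serve only as an illustration.

```python
import numpy as np

def electrical_distance_matrix(Z):
    """Return D with D[i, j] = Z[i, i] + Z[j, j] - 2 * Z[i, j]."""
    diag = np.diag(Z)
    return diag[:, None] + diag[None, :] - 2.0 * Z

# Hypothetical radial network: line 0-1 (1.0), line 1-2 (2.0), and a shunt
# of 10.0 at node 0 so that the node admittance matrix is invertible.
Y = np.zeros((3, 3))
for i, j, z in [(0, 1, 1.0), (1, 2, 2.0)]:
    y = 1.0 / z
    Y[i, i] += y
    Y[j, j] += y
    Y[i, j] -= y
    Y[j, i] -= y
Y[0, 0] += 1.0 / 10.0

Z = np.linalg.inv(Y)                 # node impedance matrix
D = electrical_distance_matrix(Z)    # symmetric, zero diagonal
print(np.round(D, 6))
```

In this radial example the distance between two nodes reduces to the series impedance of the path between them, which gives a quick sanity check of the formula.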
To quantify the quality of the power grid partition, an evaluation index of electrical modularity $Q_e$ is defined as Equation (2) based on complex network theory and the concept of weighted modularity [31]. In Equation (2), $n$ denotes the number of nodes in the power grid; $W_i$ denotes the sum of the edge weights of node $i$; $\delta(i, j) \in \{0, 1\}$ denotes whether nodes $i$ and $j$ belong to the same region, where $\delta(i, j) = 1$ indicates the same region and $\delta(i, j) = 0$ indicates different regions; and $M$ denotes the sum of the edge weights among all nodes. The edge weight $w_{ij}$ between nodes $i$ and $j$ is represented by the electrical distance. A higher $Q_e$ indicates a better partition quality.
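The index can be illustrated with a short sketch. It assumes that the electrical modularity follows the standard Newman form of weighted modularity, $Q = \frac{1}{2M}\sum_{i,j}\left(w_{ij} - \frac{W_i W_j}{2M}\right)\delta(i,j)$; the exact normalization used in Equation (2) is not reproduced here, and the function name and the six-node weight matrix are hypothetical.

```python
import numpy as np

def weighted_modularity(W, labels):
    """Newman-style weighted modularity (assumed form, see lead-in)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    strength = W.sum(axis=1)                     # W_i: edge-weight sum of node i
    two_m = W.sum()                              # 2M: each edge counted twice
    same = labels[:, None] == labels[None, :]    # delta(i, j)
    expected = np.outer(strength, strength) / two_m
    return float(((W - expected) * same).sum() / two_m)

# Hypothetical graph: two triangles joined by one weak edge.
W = np.zeros((6, 6))
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 0.1)]
for i, j, w in edges:
    W[i, j] = W[j, i] = w

q_good = weighted_modularity(W, [0, 0, 0, 1, 1, 1])   # triangles kept together
q_bad = weighted_modularity(W, [0, 1, 0, 1, 0, 1])    # triangles split up
q_one = weighted_modularity(W, [0] * 6)               # single region
print(q_good, q_bad, q_one)
```

Under this form, a single region covering the whole grid scores zero, and a partition that follows the tightly connected groups scores higher than one that cuts through them.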
Clustering algorithms are common methods for partitioning power grids [32,33], which can be categorized into prototype, density, and hierarchical clustering. Prototype clustering, e.g., the K-means algorithm, requires a predetermination of the partition number [34], while hierarchical clustering has high computational complexity in large-scale power grids. Without specifying the partition number, the mean shift algorithm in density clustering can achieve a self-adaptive power grid partition with low computational costs [35,36]. Therefore, the mean shift algorithm is adopted in this paper.
Considering the characteristics of the DSA, the symmetric electrical distance matrix and Chebyshev distance are integrated into the mean shift algorithm for the power grid partition. The mean shift-based power grid partition is introduced from four aspects: feature representation, algorithm structure, partition steps, and distance metric.
2.2. Feature Representation of the Node Location Based on a Symmetric Electrical Distance Matrix
Clustering algorithms rely on input features to distinguish electrical differences among node locations. Therefore, the feature representation of a node location is supposed to reflect these electrical differences.
The symmetric electrical distance matrix, denoted as $D$, is composed of the electrical distances between every pair of nodes. It can be represented as follows:
$$D = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & \cdots & d_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} \end{bmatrix} \tag{4}$$
$D$ is an $n$-order symmetric matrix due to the symmetry of the electrical distance (i.e., $d_{ij} = d_{ji}$). Each row of the matrix $D$ is a feature vector of the corresponding node location. For example, the location of node $i$ can be represented as an $n$-dimensional feature vector $x_i = (d_{i1}, d_{i2}, \ldots, d_{in})$, which is called the "electrical coordinate" of node $i$. Ref. [24] pointed out that the electrical coordinate can effectively reflect the electrical differences among nodes. Therefore, the electrical coordinates are selected as the node location features for the mean shift algorithm.
The row vector can be further called the full-dimensional electrical coordinate since the matrix $D$ in Equation (4) is a complete symmetric matrix of order $n \times n$. If $D$ is instead an incomplete, non-square matrix of order $n \times k$ ($k < n$), the feature vector of node $i$ is a $k$-dimensional electrical coordinate, denoted as $x_i = (d_{i1}, d_{i2}, \ldots, d_{ik})$. In general, the electrical coordinate is composed of the electrical distances to preselected reference nodes, and its dimension is equal to the number of reference nodes. The reference nodes of the full-dimensional electrical coordinate include all nodes in the power grid.
The symmetric electrical distance matrix is sensitive to changes in the power grid topology. It needs to be recalculated when the outage of a branch occurs. The common recalculation method is to modify the node admittance matrix and obtain the new node impedance matrix by inversion. The new symmetric electrical distance matrix is obtained by Equation (1). The computation time is high due to matrix inversion. To improve the recalculation efficiency, the branch addition method is adopted.
As shown in Figure 1, $Z_0$ is the impedance of line $l_{ij}$. Disconnecting line $l_{ij}$ is equivalent to adding a new branch of impedance $-Z_0$ in parallel with line $l_{ij}$. The new node impedance matrix $Z'$ can be represented as follows:
$$Z' = Z - Z A_M \left(A_M^{T} Z A_M - Z_0\right)^{-1} A_M^{T} Z \tag{5}$$
where $Z$ is the original node impedance matrix and $A_M$ is an $n$-dimensional column vector, which can be represented as $[0, \ldots, 1, \ldots, -1, \ldots, 0]^{T}$. Its $i$-th element is 1, and its $j$-th element is −1. Because no full matrix inversion is required, the new node impedance matrix can be obtained rapidly. The recalculation efficiency of the symmetric electrical distance matrix is thereby improved significantly.
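The branch addition update can be checked numerically with a small sketch. It assumes the rank-one form $Z' = Z - Z A_M (A_M^{T} Z A_M - Z_0)^{-1} A_M^{T} Z$, which follows from treating the outage as a parallel branch of impedance $-Z_0$; the four-node ring, the shunt, and the helper names `build_admittance` and `remove_branch` are hypothetical.

```python
import numpy as np

def build_admittance(n, lines, shunts):
    """Assemble a node admittance matrix from (i, j, z) lines and (k, z) shunts."""
    Y = np.zeros((n, n))
    for i, j, z in lines:
        y = 1.0 / z
        Y[i, i] += y
        Y[j, j] += y
        Y[i, j] -= y
        Y[j, i] -= y
    for k, z in shunts:
        Y[k, k] += 1.0 / z
    return Y

def remove_branch(Z, i, j, z0):
    """Update Z after disconnecting line i-j of impedance z0 (adds -z0 in parallel)."""
    a = np.zeros(Z.shape[0])
    a[i], a[j] = 1.0, -1.0            # incidence vector A_M
    Za = Z @ a                        # Z A_M (Z is symmetric)
    denom = a @ Za - z0               # A_M^T Z A_M - Z_0; zero only if the line is a bridge
    return Z - np.outer(Za, Za) / denom

lines = [(0, 1, 0.5), (1, 2, 0.4), (2, 3, 0.3), (3, 0, 0.6)]
shunts = [(0, 5.0)]
Z = np.linalg.inv(build_admittance(4, lines, shunts))

Z_fast = remove_branch(Z, 0, 1, 0.5)                             # rank-one update
Z_ref = np.linalg.inv(build_admittance(4, lines[1:], shunts))    # full re-inversion
print(np.max(np.abs(Z_fast - Z_ref)))
```

The update touches only vector products with the existing matrix, which is where the saving over a full inversion comes from. The sketch does not handle an outage that islands part of the grid, in which case the denominator vanishes.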
2.3. Mean Shift Algorithm
The underlying assumption of the mean shift algorithm is that sample points in different clusters follow different probability density distributions [37,38]. Each sample point performs iterations along the direction of the steepest increase in density and converges to the region with the highest local density. Sample points converging to the same region belong to the same cluster. With the two-dimensional feature plane in Figure 2 as an example, the iteration process of density centers for sample points is elaborated as follows. The colored circle represents the distribution of the sample point in the feature plane. The arrow represents the direction of the shift vector.
A sample point $x$ is randomly selected as the initial density center. Based on all sample points within a circle centered at $x$ with radius $r$, the shift vector $\overrightarrow{x x^{(1)}}$ is calculated. The shift vector points in the direction of the steepest increase in the density of the sample distribution. The endpoint $x^{(1)}$ of this vector is selected as the new density center, and a new shift vector $\overrightarrow{x^{(1)} x^{(2)}}$ is calculated. The above process is repeated until the length of the latest shift vector, $\|\overrightarrow{x^{(m-1)} x^{(m)}}\|$, is less than a predefined threshold. The iteration process has then converged, and the endpoint $x^{(m)}$ is considered the converged density center for sample point $x$. The shift vector $M(x)$ for sample point $x$ is calculated as follows [35]:
$$M(x) = \frac{\sum_{x_i \in S_x} G\left(\frac{x_i - x}{\sigma}\right)(x_i - x)}{\sum_{x_i \in S_x} G\left(\frac{x_i - x}{\sigma}\right)} \tag{6}$$
where $x$ represents the feature vector of the sample point, i.e., its electrical coordinate; $S_x$ denotes the high-dimensional spherical region around the sample point $x$; $G((x_i - x)/\sigma)$ represents the Gaussian weight [39]; and $\sigma$ is the bandwidth of the Gaussian weight. After one shift, the feature vector of the density center point $x^{(1)}$ can be represented as follows:
$$x^{(1)} = x + M(x)$$
The density center iteration is performed for all sample points until the iteration process of all sample points converges. The clusters are determined according to the converged density centers of sample points. Sample points converging to the same region are divided into the same cluster, while those in different regions are divided into different clusters.
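The density center iteration can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the Gaussian weight is taken as $\exp(-\tfrac{1}{2}(\text{distance}/\sigma)^2)$, the window $S_x$ is a ball of radius `radius` in the chosen norm, and the function name, parameters, and sample data are hypothetical.

```python
import numpy as np

def mean_shift(points, radius, sigma, tol=1e-6, max_iter=500, p=np.inf):
    """Return the converged density center of every sample point.

    p is the norm order used for all distances (np.inf = Chebyshev, 2 = Euclidean).
    """
    points = np.asarray(points, dtype=float)
    centers = points.copy()
    for k in range(len(points)):
        x = points[k].copy()
        for _ in range(max_iter):
            diff = points - x
            dist = np.linalg.norm(diff, ord=p, axis=1)
            mask = dist <= radius                          # samples inside S_x
            if not mask.any():
                break
            w = np.exp(-0.5 * (dist[mask] / sigma) ** 2)   # Gaussian weights
            shift = (w[:, None] * diff[mask]).sum(axis=0) / w.sum()   # M(x)
            x = x + shift                                  # move the density center
            if np.linalg.norm(shift, ord=p) < tol:
                break
        centers[k] = x
    return centers

# Hypothetical two-dimensional samples forming two well-separated groups.
samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers = mean_shift(samples, radius=1.0, sigma=0.5)
print(np.round(centers, 3))
```

Each sample is iterated independently of the others, so the outer loop is the natural place to parallelize.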
2.4. Self-Adaptive Power Grid Partition Based on the Mean Shift Algorithm
The power grid partition based on the mean shift algorithm is self-adaptive without specifying the partition number beforehand. The detailed partition steps are as follows:
- (1) Compute the symmetric electrical distance matrix, extract the electrical coordinate of each node, and set the iteration termination threshold $d_s$ and the aggregation threshold $d_a$;
- (2) Calculate the shift vector $M(x)$ of each node;
- (3) Move each node according to its shift vector $M(x)$;
- (4) Repeat Steps (2) and (3) until $\|M(x)\| < d_s$ and the density center iteration process for each node has converged;
- (5) If the distance between the converged density centers of two nodes is less than $d_a$, these two nodes are aggregated into the same cluster, i.e., the same region; otherwise, these two nodes are assigned to different regions.
Step (1) involves feature input and parameter setting. Steps (2) to (4) constitute the density center iteration process for each node. Step (5) is the node aggregation process. To ensure the topological connectivity of each local region after partitioning, a hierarchical aggregation method is adopted in Step (5).
The node aggregation process considering topological connectivity is shown in Figure 3. Points denote nodes, and the blue point denotes the aggregation center node. Lines represent topological connections between nodes, numbers on the points indicate the connectivity level of each node to the aggregation center node, green denotes division into the same region, and white denotes separation into different regions. One-level connectivity indicates a direct connection, while two-level connectivity indicates a connection through one intermediate node. Firstly, as shown in Figure 3a, a node is randomly selected as the aggregation center node, i.e., the blue point. Nodes that are 1-level connected to the blue point are identified based on the adjacency matrix. If the distance between the converged density centers of a 1-level connected node and the blue point is less than $d_a$, these two nodes are partitioned into the same region; otherwise, they belong to different regions. Then, as shown in Figure 3b, the 2-level connected nodes that are directly connected to 1-level connected nodes in the current region are identified. The distance between the converged density centers of each 2-level connected node and the blue point is calculated, and the nodes belonging to the same region are screened out. The node aggregation process continues until no more higher-level connected nodes within the same region can be found. The above process is then repeated from another unpartitioned node until all nodes are partitioned, at which point the aggregation process of the power grid terminates.
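The level-by-level aggregation described above amounts to a breadth-first search that only grows a region through topological neighbors. The sketch below is one possible reading of that procedure; the function name, the deterministic choice of the aggregation center (lowest unassigned index instead of a random node), and the toy data are assumptions.

```python
from collections import deque

import numpy as np

def aggregate_regions(adjacency, centers, d_a, p=np.inf):
    """Assign region labels so that every region is topologically connected.

    adjacency: (n, n) boolean matrix; centers: (n, d) converged density centers.
    """
    adjacency = np.asarray(adjacency, dtype=bool)
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    labels = -np.ones(n, dtype=int)
    region = 0
    for seed in range(n):                  # aggregation center node
        if labels[seed] >= 0:
            continue
        labels[seed] = region
        queue = deque([seed])
        while queue:                       # visit 1-level, 2-level, ... neighbors
            u = queue.popleft()
            for v in np.flatnonzero(adjacency[u]):
                if labels[v] >= 0:
                    continue
                gap = np.linalg.norm(centers[v] - centers[seed], ord=p)
                if gap < d_a:              # compared with the center node, as in Figure 3
                    labels[v] = region
                    queue.append(v)
        region += 1
    return labels

# Path graph 0-1-2-3 with two groups of converged centers.
path4 = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
labels_a = aggregate_regions(path4, [[0.0], [0.1], [5.0], [5.1]], d_a=1.0)

# Nodes 0 and 2 share a center but are only linked through the distant node 1.
path3 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
labels_b = aggregate_regions(path3, [[0.0], [5.0], [0.0]], d_a=1.0)
print(labels_a, labels_b)
```

The second example shows the effect of the connectivity constraint: two nodes with identical density centers still end up in different regions when no path inside the region joins them.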
In large-scale power systems, the computation cost of the mean shift algorithm is high, which lowers the power grid partition efficiency. To boost the partition efficiency, this paper makes the following two improvements: (1) Reduce the number of nodes to be clustered. A generator bus is a degree-one node that is connected to only one terminal bus. Generator buses therefore do not participate in clustering and are directly assigned to the same region as their connected terminal buses. (2) Adopt parallel computation. In the mean shift algorithm, the density center iteration for each node is independent of the iterations for other nodes and depends only on the initial node distribution. The iterations can therefore be run in parallel without affecting one another.
During the clustering process, small regions consisting of a few nodes may be formed. Constructing a DSA model for each small region can increase the number of models, occupying many computation and storage resources. To reduce the number of DSA models, the partition results of the power grid need to be fine-tuned. Small regions are aggregated into the adjacent regions.
2.5. Chebyshev Distance between Node Location Features
In clustering algorithms, the distance metric between features, a.k.a. feature distance, is a crucial factor affecting the quality of power grid partition. To achieve a power grid partition that strictly considers the electrical distance, the feature distance is supposed to reflect the electrical distance between nodes. Typically, clustering algorithms use the Euclidean distance to measure distances between sample points. However, the Euclidean distance between electrical coordinates is not equivalent to the electrical distance between nodes. In this paper, the Chebyshev distance is selected as the distance metric for further improving the quality of the power grid partition.
With electrical coordinates, the feature distance $dist(x_i, x_j)$ between nodes $i$ and $j$ can be represented in the following norm form:
$$dist(x_i, x_j) = \|x_i - x_j\|_p = \left(\sum_{k=1}^{n} |d_{ik} - d_{jk}|^{p}\right)^{1/p} \tag{9}$$
where $p$ is the order of the norm; $p = 2$ gives the Euclidean distance, and $p = \infty$ gives the Chebyshev distance. As shown in Equation (9), the feature distance $dist(x_i, x_j)$ is determined by the electrical coordinates and the $p$-norm. Ref. [40] has proven that the Chebyshev distance between full-dimensional electrical coordinates is equivalent to the electrical distance. That is, Equation (10) holds.
$$\|x_i - x_j\|_{\infty} = \max_{1 \le k \le n} |d_{ik} - d_{jk}| = d_{ij} \tag{10}$$
Therefore, full-dimensional electrical coordinates and the Chebyshev distance are employed in the mean shift algorithm to achieve the power grid partition by strictly considering electrical connections. The application of the Chebyshev distance in the mean shift algorithm includes the determination of the high-dimensional spherical region in Equation (6), the distance metric in Equation (7), the length calculation of $\|M(x)\|$ in Step (4) of Section 2.4, and the distance metric between converged density centers in Step (5) of Section 2.4.
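The equivalence between the Chebyshev distance of full-dimensional electrical coordinates and the electrical distance can be checked numerically. The sketch below uses a hypothetical five-node resistive network, for which the electrical distance satisfies the triangle inequality; it illustrates the statement on one example and does not replace the general proof in Ref. [40].

```python
import numpy as np

# Hypothetical resistive network given as (from, to, resistance).
lines = [(0, 1, 0.5), (1, 2, 0.4), (2, 3, 0.3), (3, 0, 0.6), (3, 4, 0.2)]
n = 5
Y = np.zeros((n, n))
for i, j, r in lines:
    Y[i, i] += 1.0 / r
    Y[j, j] += 1.0 / r
    Y[i, j] -= 1.0 / r
    Y[j, i] -= 1.0 / r
Y[0, 0] += 1.0 / 5.0                      # shunt so that Y is invertible

Z = np.linalg.inv(Y)
diag = np.diag(Z)
D = diag[:, None] + diag[None, :] - 2.0 * Z   # rows are electrical coordinates

diff = D[:, None, :] - D[None, :, :]      # diff[i, j, k] = d_ik - d_jk
chebyshev = np.abs(diff).max(axis=2)      # p = infinity
euclidean = np.sqrt((diff ** 2).sum(axis=2))  # p = 2

print(np.max(np.abs(chebyshev - D)))      # numerically zero in this example
print(np.max(np.abs(euclidean - D)))      # clearly non-zero
```

The maximum over $k$ is attained at $k = j$ (where $d_{jj} = 0$), while the Euclidean distance accumulates contributions from every reference node and therefore overestimates the electrical distance.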
3. Construction of the Input Feature Set for DSA
Multiple DSA models are constructed for the anticipated fault locations in each local region. For a specific fault type and a power grid with a fixed topology, the dynamic security of the power system is determined by pre-fault OCs and fault locations. Input data must include OCs and fault location information. In this paper, pre-fault OCs are represented by steady-state power flow features, which are widely used by existing research. Anticipated fault locations are represented by electrical coordinates.
3.1. Feature Extraction of OCs Based on SDAE
The steady-state power flow features are used to represent pre-fault OCs. These features include the active and reactive power of each generator, the active and reactive power of each load, the voltage magnitude and phase angle of each bus, and the active and reactive transmission power on each line. It is noted that the OCs are intended to describe the operating status of the whole power system rather than a local region. Therefore, steady-state power flow features across the whole power system are adopted. These features are generally high-dimensional and exhibit feature redundancy, deteriorating the training accuracy and efficiency of the data-driven DSA model. Thus, high-level features must be extracted from high-dimensional steady-state features.
SDAE is an unsupervised multilayer neural network that extracts abstract high-level features by stacking multiple denoising autoencoders (DAEs) [41,42]. It has a robust feature extraction capability. In this section, SDAE is adopted as an extractor for steady-state power flow features, and the extracted high-level features are used as inputs of each DSA model for different local regions.
Each DAE in the SDAE consists of an encoder and a decoder. The training objective is to minimize the error between the input of the encoder and the reconstruction output of the decoder. The loss function $L_D$ of the DAE is given by Equation (11) [43], in which $N$ represents the number of training samples; the $i$-th training sample feature vector, denoted as $F_i$, serves as the encoder input; and $z_i$ is the reconstructed feature vector, which serves as the decoder output. The encoder output is the set of high-level features extracted by the DAE. In the SDAE, the DAE in the first layer uses the steady-state power flow features as input for training, and the outputs of each layer serve as the inputs of the next layer. The SDAE is thus trained layer by layer. The outputs of the final layer are the high-level features used as one part of the DSA model inputs.
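A single DAE layer can be sketched as a forward pass plus a reconstruction loss. The sketch assumes additive Gaussian corruption, a sigmoid encoder, a linear decoder, and a mean squared reconstruction error averaged over the $N$ samples; none of these choices is confirmed by the text, and all names and sizes are hypothetical. Training (gradient updates) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dae_forward(F, W_enc, b_enc, W_dec, b_dec, noise_std=0.1):
    """Corrupt the input, encode it, and reconstruct it."""
    F_noisy = F + noise_std * rng.standard_normal(F.shape)   # assumed corruption
    h = sigmoid(F_noisy @ W_enc + b_enc)     # encoder output: high-level features
    z = h @ W_dec + b_dec                    # decoder output: reconstruction z_i
    return h, z

def reconstruction_loss(F, z):
    """Assumed loss: mean over samples of the squared error ||F_i - z_i||^2."""
    return float(np.mean(np.sum((F - z) ** 2, axis=1)))

N, n_in, n_hid = 8, 6, 3                     # hypothetical sizes
F = rng.standard_normal((N, n_in))           # stand-in for power flow features
W_enc = 0.1 * rng.standard_normal((n_in, n_hid))
b_enc = np.zeros(n_hid)
W_dec = 0.1 * rng.standard_normal((n_hid, n_in))
b_dec = np.zeros(n_in)

h, z = dae_forward(F, W_enc, b_enc, W_dec, b_dec)
loss = reconstruction_loss(F, z)             # compared against the clean input
print(h.shape, z.shape, loss)
```

In a stacked arrangement, the encoder output `h` of a trained layer would be fed as the input `F` of the next DAE, mirroring the layer-by-layer training described above.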
3.2. Redundancy Analysis of Electrical Coordinates for Fault Locations in a Local Region
The full-dimensional electrical coordinate of order $n$ is used to represent the node location in the mean shift-based power grid partition. However, it is redundant for DSA when representing fault locations within a local region. As shown in Figure 4, Regions A and B are two interconnected local regions with $l$ tie lines. $A_i$ and $B_i$ are the nodes at the two ends of the $i$-th tie line. The electrical distance between node $P_1$ in region A and node $Q_1$ in region B is calculated as follows:
$$d_{P_1Q_1} = Z_{P_1P_1} + Z_{Q_1Q_1} - 2Z_{P_1Q_1} \tag{12}$$
Based on the physical meaning of the mutual impedance, $Z_{P_1Q_1}$ is numerically equal to the voltage at node $P_1$ when a unit current is injected into node $Q_1$ and no current is injected into other nodes. With a unit current injected into node $Q_1$, the currents on the tie lines are denoted as $k_1, k_2, \ldots, k_l$. According to the superposition theorem, the voltage at node $P_1$ is equal to the sum of the voltages produced at node $P_1$ when region B is removed and the currents $k_1, k_2, \ldots, k_l$ are injected at nodes $A_1, \ldots, A_l$, respectively. It can be represented as follows:
$$Z_{P_1Q_1} = \sum_{i=1}^{l} k_i Z'_{P_1A_i} \tag{13}$$
where $Z'_{P_1A_i}$ represents the mutual impedance between nodes $P_1$ and $A_i$ after region B and the tie lines are removed. Substituting Equation (13) into Equation (12) yields the following equation:
$$d_{P_1Q_1} = Z_{P_1P_1} + Z_{Q_1Q_1} - 2\sum_{i=1}^{l} k_i Z'_{P_1A_i} \tag{14}$$
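Assume there are $m$ reference nodes located in region B, denoted as $Q_1, \ldots, Q_m$. The currents on the tie lines are denoted as $k_{i1}, k_{i2}, \ldots, k_{il}$ when a unit current is injected into node $Q_i$. The $m$-dimensional electrical coordinate composed of the electrical distances between node P and the $m$ nodes $Q_i$, denoted as $x_P$, can be represented as follows:
$$x_P = \begin{bmatrix} Z_{PP} + Z_{Q_1Q_1} - 2\sum_{t=1}^{l} k_{1t} Z'_{PA_t} \\ \vdots \\ Z_{PP} + Z_{Q_mQ_m} - 2\sum_{t=1}^{l} k_{mt} Z'_{PA_t} \end{bmatrix} \tag{15}$$
Equation (15) can be rewritten into the following form:
$$x_P = K_{m \times (l+1)} Z_{(l+1)} + C_m \tag{16}$$
where $K_{m \times (l+1)}$ is a constant matrix of order $m \times (l + 1)$ whose $i$-th row is $(1, -2k_{i1}, \ldots, -2k_{il})$; $C_m = (Z_{Q_1Q_1}, \ldots, Z_{Q_mQ_m})^{T}$ is an $m$-dimensional constant vector; and $Z_{(l+1)} = (Z_{PP}, Z'_{PA_1}, \ldots, Z'_{PA_l})^{T}$ is an $(l + 1)$-dimensional vector related to nodes P, $A_1, \ldots, A_l$. The $m$-dimensional electrical coordinate of a node in region A therefore depends only on $Z_{(l+1)}$. Due to the sparse electrical connections between regions A and B, the number of tie lines is small. The feature dimension $m$ is then greater than $l + 1$, resulting in feature redundancy.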
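The redundancy can be observed numerically. In the hypothetical network below, region A (five nodes) and region B (six nodes) are joined by a single tie line ($l = 1$). The electrical coordinates of the region-A nodes with respect to the six reference nodes in region B are collected in a matrix, and its rank after removing the constant offset is compared with $l + 1$. The topology, impedances, and shunts are illustrative assumptions.

```python
import numpy as np

def ring(nodes, z):
    """Lines forming a closed ring through the given nodes."""
    return [(nodes[k], nodes[(k + 1) % len(nodes)], z) for k in range(len(nodes))]

region_a = list(range(0, 5))
region_b = list(range(5, 11))
tie_lines = [(0, 5, 0.8)]                          # l = 1 tie line
lines = ring(region_a, 0.3) + ring(region_b, 0.2) + tie_lines

n = 11
Y = np.zeros((n, n))
for i, j, z in lines:
    Y[i, i] += 1.0 / z
    Y[j, j] += 1.0 / z
    Y[i, j] -= 1.0 / z
    Y[j, i] -= 1.0 / z
Y += np.diag(np.full(n, 0.05))                     # small shunt at every node

Z = np.linalg.inv(Y)
diag = np.diag(Z)
D = diag[:, None] + diag[None, :] - 2.0 * Z

X = D[np.ix_(region_a, region_b)]    # rows: nodes in A, columns: references in B
X_centered = X - X.mean(axis=0)      # removes the constant vector C_m
rank = np.linalg.matrix_rank(X_centered, tol=1e-9)
print(X.shape, rank)
```

Although each region-A node carries six coordinates with respect to region B, the centered coordinates span at most $l + 1 = 2$ independent directions, which is the redundancy that motivates selecting reference nodes within the region itself.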
To reduce the feature redundancy, the electrical coordinate reference nodes of a fault location in a local region should be selected within that region. The symmetric electrical distance matrix is modified as shown in Equation (17), where $D_A$ and $D_B$ are used to represent the fault locations occurring in regions A and B, respectively.
The dimension of steady-state power flow features can be significantly reduced by the feature extraction of SDAE. The dimension of electrical coordinates can also be reduced using block matrices from the symmetric electrical distance matrix to represent fault locations in local regions. Consequently, the dimension of input features for DSA does not drastically increase with the growth of the system scale. The proposed input feature representation method for DSA is applicable in large-scale power systems.
4. Dynamic Security Partition Assessment
The DSA model is constructed based on the RBFNN due to its flexible distance metric. It can leverage the electrical coordinates and the Chebyshev distance to distinguish similar samples. To improve the adaptability of the DSA model, an incremental updating strategy is designed to maintain high assessment accuracy as the OCs and anticipated fault locations change.
4.1. Construction of the RBFNN-Based DSA Model for a Local Region
The anticipated fault locations in the same region have similar impacts on the system dynamic security. The nonlinearity of the input–output map within the training set for each local region is therefore reduced, and the map is suitable to be learned by shallow models. RBFNN is a three-layer feedforward neural network with a single hidden layer. It has a simple structure, high training efficiency, and an interpretable decision-making process [44]. It can approximate any continuous function with arbitrary precision. Therefore, multiple DSA models are constructed based on the RBFNN for the anticipated fault locations in each local region.
As shown in Figure 5, the input to the RBFNN consists of OC and anticipated fault location features. Each neuron in the RBFNN hidden layer corresponds to a center vector obtained from the training set by the clustering algorithm. The radial basis function $\Phi(\cdot)$, which is the activation function of the hidden layer neurons, measures the difference between the sample and the center vector. The Gaussian function is a widely used radial basis function and is given by Equation (18) [39], in which $F_i$ is the feature vector of the $i$-th sample and $F_c$ is the center vector of the hidden layer neuron.
The feature vector is defined as $F = (o, x)$, where $o$ denotes the high-level steady-state features of the OCs, and $x$ is the electrical coordinate of the anticipated fault location. The OC features represented by steady-state features are typically treated as Euclidean data, which are measured by the Euclidean distance. The anticipated fault location features, represented by electrical coordinates, are non-Euclidean data, which are measured by the Chebyshev distance. Considering these two types of feature distance metrics, $dist(F_i, F_c)$ is defined as follows:
$$dist(F_i, F_c) = \alpha \|o_i - o_c\|_2 + \beta \|x_i - x_c\|_{\infty} \tag{19}$$
where $(o_i, x_i)$ and $(o_c, x_c)$ are the OC and fault location parts of $F_i$ and $F_c$, respectively, and $\alpha$ and $\beta$ are weight coefficients with $\alpha + \beta = 1$.
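The two-part distance and the hidden-layer activation can be sketched as follows. The sketch assumes that the feature distance is the weighted sum of the Euclidean distance between OC features and the Chebyshev distance between electrical coordinates, and that the Gaussian radial basis function has the common form $\exp(-dist^2 / (2\sigma^2))$ with a width $\sigma$; the function names, the width parameter, and the numbers are hypothetical.

```python
import numpy as np

def mixed_distance(o_i, x_i, o_c, x_c, alpha=0.5):
    """alpha * Euclidean(OC features) + (1 - alpha) * Chebyshev(electrical coordinates)."""
    beta = 1.0 - alpha
    d_oc = np.linalg.norm(np.asarray(o_i) - np.asarray(o_c), ord=2)
    d_loc = np.linalg.norm(np.asarray(x_i) - np.asarray(x_c), ord=np.inf)
    return alpha * d_oc + beta * d_loc

def gaussian_rbf(dist, sigma=1.0):
    """Assumed Gaussian activation of a hidden neuron."""
    return np.exp(-dist ** 2 / (2.0 * sigma ** 2))

# Sample (o_i, x_i) against a hidden-layer center (o_c, x_c).
d = mixed_distance(o_i=[3.0, 4.0], x_i=[1.0, 5.0, 2.0],
                   o_c=[0.0, 0.0], x_c=[1.5, 3.0, 2.0], alpha=0.4)
phi = gaussian_rbf(d, sigma=2.0)
print(d, phi)   # d = 0.4 * 5.0 + 0.6 * 2.0 = 3.2
```

Choosing $\alpha$ close to 1 makes the neuron respond mainly to operating-condition similarity, whereas a small $\alpha$ emphasizes the electrical proximity of the fault location.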
Both the mean shift-based grid partition and RBFNN employ the Chebyshev distance, which focuses on the maximum value and disregards other values. The mean shift algorithm and RBFNN expect to measure differences between locations according to electrical distance, and the Chebyshev distance between full-dimensional electrical coordinates is equivalent to the electrical distance. Therefore, the Chebyshev distance is sufficient in this paper.
The Chebyshev distance between electrical coordinates is affected by the power grid topology. If there is an island region in the power grid, the electrical coordinates and Chebyshev distances involving that region become infinite. Neither the mean shift algorithm nor the RBFNN can handle infinite feature values. To ensure the feasibility of the methods proposed in this paper, island regions need to be identified based on the adjacency matrix and removed.
4.2. Online Incremental Update of the DSA Model
The typical mode of current data-driven DSA methods is "offline training, online assessment". However, the OCs of the power grid are constantly changing, and the training set generated offline cannot fully cover the actual OCs and anticipated fault locations. During the online assessment, significant differences between the samples to be evaluated and the training set may lead to low generalization accuracy of the DSA model. To maintain the assessment accuracy of the DSA model, it is important to generate new training samples and update the DSA model online. This section designs an online incremental updating method for the DSA model based on distance metrics and the RBFNN. The new sample set for model updating is denoted as $X_{new}$. The detailed model updating steps are as follows:
- (1) The new OCs and anticipated fault locations are obtained based on real-time and future prediction information. High-level steady-state features and electrical coordinates are extracted, and an unlabeled new sample set is generated.
- (2) The minimum feature distance between each new sample and the RBFNN hidden layer center vectors is calculated according to Equation (19). If the minimum exceeds the predefined threshold $D_t$, a dynamic security label of the new sample is obtained based on a time-domain simulation, and the sample is stored in $X_{new}$. If the minimum is lower than $D_t$, the new sample is discarded.
- (3) Central samples of $X_{new}$ are obtained based on the mean shift algorithm and Equation (19).
- (4) New neurons are added to the RBFNN hidden layer with the central samples from $X_{new}$ as their center vectors. The weights between the new neurons and the output layer are trained based on $X_{new}$, while the other structures and weights remain unchanged.
With the above incremental updating method, the DSA model can efficiently acquire knowledge from the new sample set. Moreover, it can avoid catastrophic forgetting during the model updating process. By repeatedly performing incremental updates, the rolling update of the DSA model can be achieved, ensuring that the model consistently maintains high generalization accuracy.
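The updating steps can be sketched with a small RBF model. Several simplifications are assumed and should not be read as the authors' implementation: the plain Euclidean distance stands in for Equation (19), every retained new sample becomes a center (the mean shift selection of central samples is skipped), the labels are supplied directly instead of coming from a time-domain simulation, and the new output weights are fitted by least squares on the residual. The class and method names are hypothetical.

```python
import numpy as np

class IncrementalRBF:
    """Minimal RBF model whose hidden layer can grow without retraining old weights."""

    def __init__(self, centers, weights, sigma=1.0):
        self.centers = np.asarray(centers, dtype=float)
        self.weights = np.asarray(weights, dtype=float)
        self.sigma = sigma

    def _phi(self, X, centers):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        return np.exp(-d ** 2 / (2.0 * self.sigma ** 2))

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return self._phi(X, self.centers) @ self.weights

    def novel_mask(self, X, d_t):
        """Step (2): keep samples whose nearest center is farther away than d_t."""
        X = np.asarray(X, dtype=float)
        d = np.linalg.norm(X[:, None, :] - self.centers[None, :, :], axis=2)
        return d.min(axis=1) > d_t

    def update(self, X_new, y_new):
        """Step (4): add neurons at X_new and fit only their output weights."""
        X_new = np.asarray(X_new, dtype=float)
        y_new = np.asarray(y_new, dtype=float)
        residual = y_new - self.predict(X_new)       # what the old model misses
        phi_new = self._phi(X_new, X_new)
        w_new, *_ = np.linalg.lstsq(phi_new, residual, rcond=None)
        self.centers = np.vstack([self.centers, X_new])
        self.weights = np.concatenate([self.weights, w_new])

model = IncrementalRBF(centers=[[0.0, 0.0], [1.0, 0.0]], weights=[1.0, -1.0], sigma=0.5)
candidates = np.array([[0.1, 0.0], [4.0, 4.0], [5.0, 4.0]])
keep = model.novel_mask(candidates, d_t=1.0)   # the first candidate is already covered
X_new = candidates[keep]
y_new = np.array([1.0, -1.0])                  # stand-in for simulated labels
old_weights = model.weights.copy()
model.update(X_new, y_new)
print(keep, model.centers.shape)
```

Because the existing centers and output weights are left untouched, the response of the model near previously learned samples changes only through the tails of the new neurons, which is the mechanism behind avoiding catastrophic forgetting.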
4.3. Framework of the Dynamic Security Partition Assessment
The mode of "offline training, rolling update, online assessment" is adopted for the dynamic security partition assessment. The assessment framework is illustrated in Figure 6. It mainly includes three blocks: offline training, rolling updating, and online assessment. Each block is elaborated as follows:
- (1) Offline training provides the initial training data and models for DSA. The power grid is first partitioned into several regions based on the mean shift algorithm. A training set is generated for each region, which includes the high-level steady-state features extracted by SDAE and the corresponding block of the symmetric electrical distance matrix. Multiple DSA models are constructed based on the RBFNN and each training set. The OC features of the RBFNN models in each region are the same.
- (2) Rolling updating periodically updates the DSA models to maintain high DSA performance over time. During the rolling updating stage, new training samples are generated periodically. The DSA models are updated online and incrementally based on the proposed updating method. The number of DSA models is not increased, while catastrophic forgetting can be avoided. In this block, T represents the model updating cycle.
- (3) Online assessment gives the final assessment results. Combinations of OCs and anticipated faults are obtained, and the region where the fault is located is first determined. Next, the steady-state features and electrical coordinates of the combination are extracted. Finally, the DSA model corresponding to the fault location is adopted to obtain the assessment results.
The quality of input data is crucial for data-driven DSA. Poor data may arise in two stages. (1) In the offline training stage, insufficient training samples and class imbalance may occur. To generate sufficient samples for model training, a wider range of OCs and more anticipated fault locations can be covered by time-domain simulations. An oversampling technique can be employed to increase the number of samples in minority classes. (2) In the online assessment stage, the acquired OC data may contain noise. This paper employs SDAE, which exhibits strong robustness to noise, to process the OC features. The high-level features extracted by SDAE are used as inputs of the RBFNN, reducing the impact of noise on the assessment results of the RBFNN.
The comprehensive flow chart of the proposed data-driven dynamic security partition assessment is shown in Figure 7. It includes the power grid partition, sample generation, DSA model construction, model updating, and online assessment. The DSA of the power system depends on communication and big data technology. It is therefore vulnerable to false data injection attacks (FDIAs), which inject false information into the cyber system of the power grid and can cause the DSA model to misjudge. To reduce the risk of FDIAs, false data detection is performed on new training samples before the model update. The false data are identified and eliminated, and the DSA model is then updated online with the preprocessed new training samples.
The power system DSA includes transient rotor angle stability, voltage stability, and frequency stability assessments. The proposed partition assessment method is applicable to the above three scenarios. In this paper, the transient rotor angle stability assessment is adopted as the application scenario. The output of each regional DSA model provides the transient angle stability results of the whole power system under a combination of OCs and anticipated fault locations. The power system is defined as unstable when the maximum rotor angle difference between synchronous generators exceeds 180° during the transients.
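The instability criterion above can be written as a short labeling function. This is a minimal sketch, assuming the simulated rotor angles are available as an unwrapped array in degrees with one row per time step and one column per synchronous generator; the function name `is_unstable` and the array layout are assumptions for illustration.

```python
import numpy as np


def is_unstable(rotor_angles_deg, threshold_deg: float = 180.0) -> bool:
    """Label a simulated trajectory using the maximum rotor angle difference.

    rotor_angles_deg: array of shape (time_steps, generators), in degrees.
    """
    angles = np.asarray(rotor_angles_deg, dtype=float)
    # Largest angle difference between any two generators at each time step.
    spread = angles.max(axis=1) - angles.min(axis=1)
    # Unstable if the difference exceeds the threshold at any time step.
    return bool(spread.max() > threshold_deg)


# Toy trajectories with three generators and two time steps (made-up values).
stable_case = [[0.0, 10.0, 20.0], [5.0, 40.0, 60.0]]
unstable_case = [[0.0, 10.0, 20.0], [5.0, 100.0, 200.0]]
```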
The power grid is partitioned into multiple regions, and dedicated DSA models are constructed for the fault locations in each region. Each DSA model requires global OC information as input, and its output is the global dynamic security result for the entire power system. These global data are typically consolidated at the provincial power dispatch center and not shared among the different grid regions. Therefore, the construction, storage, calling, and updating of DSA models are usually carried out at the provincial power dispatch center. When the limited central computing resources are insufficient, distributed computing technology can be employed. The power dispatch center assigns the updating tasks for DSA models to the computing resources in each region, and the updated model parameters are then sent back to the power dispatch center.
5. Experimental Results and Discussions
The effectiveness of the proposed dynamic security partition assessment method is verified on a simplified provincial power system in China. The example system includes 112 buses, 32 generators, and three HVDC links. The transmission powers of the three HVDC links are 8 GW, 8 GW, and 4 GW. Nine wind farms are integrated into the system. The simulation software is PSSE 34. Numerical simulation and machine learning are performed on a server with 64 GB of RAM and a 24-core Intel Xeon Gold 6148 CPU (2.40 GHz).
5.1. Test of Power Grid Partition
The power grid is partitioned based on the mean shift algorithm integrated with the symmetric electrical distance matrix and Chebyshev distance. The symmetric electrical distance matrix is a 112-order matrix. The radius of the high-dimensional spherical region is set to the average value of the symmetric electrical distance matrix. The bandwidth σ in the Gaussian weight is set to 20. The iteration termination threshold ds is set to 0.01. The aggregation threshold da is a key factor in the partition quality. The electrical modularity Qe under different aggregation thresholds da is shown in Figure 8. Qe reaches its maximum when da = 1.6. Therefore, da is set to 1.6.
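To make the role of these parameters concrete, the following is a minimal NumPy sketch of a mean shift procedure with a Chebyshev-distance neighborhood, a Gaussian weight, a termination threshold, and an aggregation threshold. It is a generic illustration under stated assumptions, not the paper's implementation: the function names, the exact form of the Gaussian weight, and the toy two-dimensional points and parameter values are all hypothetical. In the paper, each bus would be described by its electrical distances rather than by toy coordinates.

```python
import numpy as np


def chebyshev(a, b):
    """Chebyshev distance along the last axis (largest coordinate difference)."""
    return np.max(np.abs(a - b), axis=-1)


def mean_shift(points, radius, sigma, ds, da, max_iter=200):
    """Shift every point to a density mode, then merge modes closer than da."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for i in range(len(points)):
        x = points[i]
        for _ in range(max_iter):
            dist = chebyshev(points, x)
            inside = dist <= radius            # neighbors within the search region
            if not inside.any():               # guard: empty neighborhood
                break
            w = np.exp(-dist[inside] ** 2 / (2.0 * sigma ** 2))  # Gaussian weight
            new_x = (w[:, None] * points[inside]).sum(axis=0) / w.sum()
            shift = chebyshev(new_x, x)
            x = new_x
            if shift < ds:                     # iteration termination threshold
                break
        modes[i] = x
    # Aggregate converged modes that lie within da of an existing cluster center.
    labels = np.full(len(points), -1, dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if chebyshev(m, c) < da:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)


# Toy example: two well-separated groups of points (made-up coordinates).
toy_points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels, centers = mean_shift(toy_points, radius=1.0, sigma=1.0, ds=0.01, da=0.5)
```

A larger `da` merges more modes and yields fewer regions, which is why the partition quality is sensitive to this threshold.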
The number of buses in different regions before fine-tuning is shown in Figure 9a. Different colors represent different regions. “32.1% (36)” denotes that the region includes 36 buses and accounts for 32.1% of the total bus count. During the fine-tuning process, three small regions with fewer than five buses are aggregated into adjacent regions. The number of buses in different regions after fine-tuning is shown in Figure 9b. The distribution of the remaining five regions is shown in Figure 10. Different colors represent different regions, and the numbers denote the region indices.
To verify the effectiveness of the Chebyshev distance, mean shift algorithms with different distance metrics are compared in Figure 11. The distance metrics include the Chebyshev, Euclidean, Manhattan, and cosine distances. The results show that the Chebyshev distance metric yields the best power grid partition quality. This demonstrates that the Chebyshev distance is well suited to the mean shift-based power grid partition.
5.2. Test of Dynamic Security Partition Assessment
Multiple DSA models are constructed for anticipated fault locations in each local region. To generate the dataset for DSA, the time-domain simulations are performed as follows. The power of each load in the system fluctuates randomly between 80% and 120%. The outputs of the generators are adjusted based on load levels and spinning reserve capacity. A total of 1000 OCs are obtained. The anticipated faults are N-1 three-phase short circuits at 10%, 20%, …, 90% of the length of each line. The fault duration is 0.25 s. For each fault location, 100 OCs are randomly selected. Following the above simulation settings, a dataset of 92,700 samples is generated, which includes 50,514 stable samples and 42,186 unstable samples. The number of anticipated fault locations and samples in each region is shown in Table 1. It is worth noting that the proposed partition method reveals a special region, region 3, in which no stable samples exist. This indicates that the risk of transient rotor angle instability caused by faults in region 3 is high. Due to the lack of stable samples, faults in region 3 are directly considered to cause instability, and no dedicated DSA model is built for this region.
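The dataset figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the derived counts of fault locations and lines are implied by the arithmetic under the assumption that every fault location contributes exactly 100 samples and every line contributes nine fault locations, and they are not quoted from the paper.

```python
# Figures stated in the text.
total_samples = 92_700
stable_samples = 50_514
unstable_samples = 42_186
ocs_per_fault_location = 100
locations_per_line = 9          # faults at 10%, 20%, ..., 90% of each line

# The two classes add up to the full dataset.
classes_consistent = stable_samples + unstable_samples == total_samples

# Implied counts (assumption: uniform sampling per location and per line).
fault_locations = total_samples // ocs_per_fault_location
faulted_lines = fault_locations // locations_per_line

# Class balance of the dataset.
stable_share = stable_samples / total_samples

print(classes_consistent, fault_locations, faulted_lines, round(stable_share, 3))
```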
The steady-state power flow features representing the OCs are extracted by SDAE. The structure of the SDAE is (50, 40, 30), so the dimension of the extracted high-level features is 30. The symmetric electrical distance matrix of each region is used to represent the regional fault locations. Except for region 3, the samples in each region are randomly divided into training and test samples at a ratio of 3:1. Then, four RBFNN-based DSA models are constructed on the training samples in regions 1, 2, 4, and 5. The hyper-parameters of the RBFNN are shown in Table 2. The center vectors of the hidden-layer neurons are obtained from the training samples using the default K-means algorithm.
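The 3:1 split and the K-means selection of center vectors can be sketched as follows. This is a minimal NumPy illustration using plain Lloyd iterations with Euclidean distance; the function names, the toy data, and the number of centers are assumptions, and the paper's actual K-means settings are not reproduced here.

```python
import numpy as np


def split_3_to_1(n_samples, rng):
    """Randomly split sample indices into training and test sets at 3:1."""
    idx = rng.permutation(n_samples)
    n_train = (3 * n_samples) // 4
    return idx[:n_train], idx[n_train:]


def kmeans_centers(X, k, rng, n_iter=50):
    """Plain Lloyd's K-means; the returned centers act as RBF center vectors."""
    X = np.asarray(X, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[assign == j]
            if len(members):                  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


rng = np.random.default_rng(0)
train_idx, test_idx = split_3_to_1(100, rng)

# Toy feature vectors forming two compact groups (made-up values).
X_toy = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
toy_centers = kmeans_centers(X_toy, k=2, rng=rng)
toy_centers = toy_centers[np.argsort(toy_centers[:, 0])]
```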
Figure 12 presents the comparison results of the partition assessment and the unified assessment with different machine learning models. The unified assessment constructs one model for the fault locations in all regions. The machine learning models for comparison include RBFNN, CNN, deep belief network (DBN) [45], gated recurrent unit (GRU) [13], RNN [28], random forest (RF) [46], and extreme gradient boosting (XGBoost) [47]. The structures of the above machine learning models are shown in Table 3. As shown in Figure 12, the accuracy of the partition assessment is higher than that of the unified assessment, regardless of the machine learning model used. This demonstrates that the proposed partition assessment method is effective.
Table 4 further compares the results of different machine learning models for the partition assessment. The accuracy of RBFNN is close to that of CNN, RNN, and XGBoost. This demonstrates that the constructed RBFNN-based DSA model can achieve an assessment performance close to that of more complex models.
The model for fault locations in region 1 is denoted as “Model 1”. With Model 1 as an example, the effectiveness of the proposed incremental updating method of the DSA model is verified. A total of 4600 new samples are generated. The OCs and fault locations in the new samples are not included in the old samples. Samples are divided as shown in Figure 13. Old and new samples are each divided into training and test samples at a ratio of 3:1. The proposed incremental updating method is compared with retraining and continued training. With incremental updating, Model 1 is trained on the new training samples, and 26 new neurons are added to the hidden layer. The incremental updating is implemented with the layer-freezing mechanism in Keras. With retraining, the model is trained from scratch on both old and new training samples. The number of training epochs is set to 100. With continued training, the initial model is further trained on the new training samples only. The assessment accuracy of the updated Model 1 on old and new test samples and the updating time are shown in Table 5.
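The idea of freezing existing parameters and training only newly added hidden neurons can be illustrated with a small NumPy sketch. This is not the paper's Keras implementation: it substitutes a least-squares fit of the new output weights for gradient-based training, and the Gaussian activation over the Chebyshev distance, the function names, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def rbf(X, centers, width):
    """Gaussian activations based on the Chebyshev distance to each center."""
    d = np.max(np.abs(X[:, None, :] - centers[None, :, :]), axis=2)
    return np.exp(-(d / width) ** 2)


def predict(X, centers, weights, width):
    """Output score of the RBF network for each sample."""
    return rbf(X, centers, width) @ weights


def incremental_update(X_new, y_new, old_centers, old_weights, new_centers, width):
    """Add new hidden neurons and fit only their output weights."""
    # Frozen part: the old neurons and their weights are left untouched.
    residual = y_new - predict(X_new, old_centers, old_weights, width)
    # Trainable part: the new neurons learn what the frozen part cannot explain.
    H_new = rbf(X_new, new_centers, width)
    new_weights, *_ = np.linalg.lstsq(H_new, residual, rcond=None)
    centers = np.vstack([old_centers, new_centers])
    weights = np.concatenate([old_weights, new_weights])
    return centers, weights


# Toy setup: one old neuron near the origin, new samples far away from it.
old_centers = np.array([[0.0, 0.0]])
old_weights = np.array([1.0])
X_new = np.array([[10.0, 10.0], [10.1, 9.9]])
y_new = np.array([-1.0, -1.0])
new_centers = np.array([[10.0, 10.0]])

centers, weights = incremental_update(
    X_new, y_new, old_centers, old_weights, new_centers, width=1.0
)
old_score = predict(np.array([[0.0, 0.0]]), centers, weights, width=1.0)[0]
new_score = predict(np.array([[10.0, 10.0]]), centers, weights, width=1.0)[0]
```

Because the new neurons respond only near the new samples, the score at the old operating point is essentially unchanged, which is the intuition behind avoiding catastrophic forgetting in this sketch.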
The assessment accuracy with continued training on old test samples is significantly lower than the other two. It demonstrates that continued training leads to catastrophic forgetting of old knowledge. The accuracy with incremental updating on old test samples is the highest, while the assessment accuracy on new test samples is close to the other two. Moreover, the incremental updating process takes only 30 s. Test results demonstrate that the proposed incremental updating method has high updating performance and efficiency while avoiding catastrophic forgetting of old knowledge.
5.3. Tests on a Large-Scale Practical Power System in China
The practical power system is a regional system in China. It includes 2624 buses, 527 generator buses, 919 load buses, 2349 AC lines, two 800 kV DC lines, eight 500 kV DC lines, and 1634 transformers.
The power grid is partitioned based on the mean shift algorithm. The number of buses in different regions before fine-tuning is shown in Figure 14a. During the fine-tuning process, two small regions are aggregated into adjacent regions. The power grid is finally partitioned into three regions. The clustering results of buses in the three-dimensional feature space are shown in Figure 15, where d10 represents the electrical coordinate feature of the tenth dimension. Different colors represent different clusters.
To generate the dataset, the time-domain simulations are performed as follows. The level of each load fluctuates randomly between 80% and 120%. The outputs of the generators are adjusted based on the load levels and spinning reserve capacity. A set of 7000 OCs is obtained. The anticipated faults are N-1 three-phase short circuits on each bus except for generator buses. The fault duration is set to 0.25 s. For each anticipated fault location, 50 OCs are randomly selected. A total of 104,850 samples are generated. The sample distribution in each region is shown in
Table 6.
Figure 16 presents a comparison of the results from partition and unified assessments with RBFNN. The accuracy of the partition assessment is significantly higher than that of the unified assessment. It demonstrates that the proposed dynamic security partition method is applicable to large-scale power systems.
6. Conclusions
This paper proposes a data-driven dynamic security partition assessment method for power systems, aiming to develop accurate and incrementally updatable DSA models with simple structures. Using the mean shift algorithm integrated with a symmetric electrical distance matrix and Chebyshev distance, the power grid is partitioned in strict accordance with its electrical connections. Multiple DSA models are constructed based on the RBFNN and Chebyshev distance, achieving high assessment accuracy with a simple model structure. DSA models are updated online based on the incremental updating strategy, which achieves high updating efficiency and avoids catastrophic forgetting during the updating process. The case study demonstrates that the partition assessment has higher accuracy than the unified assessment. In the partition assessment, the RBFNN-based assessment models achieve accuracy close to that of deep learning and ensemble learning models. The proposed incremental updating method can greatly reduce the updating time while maintaining high accuracy.
In addition to pre-fault operating conditions and fault locations, the topology is also a significant factor affecting the dynamic security of the power system. In this paper, the input feature set of the DSA model does not include power grid topology information. Consequently, the DSA model must be reconstructed whenever the power grid topology changes. Since topological changes are reflected in the symmetric electrical distance matrix, a feature representation method for the topology based on this matrix will be studied. Furthermore, the construction of a low-complexity DSA model considering OCs, fault locations, and topology is also a focus of future research.