1. Introduction
Power grid disturbances are caused by various events, including line trips, generator trips, and load disconnections, among others [1]. The timely detection of these events is essential to avoid severe consequences, including large-scale blackouts, which can cost up to $10 billion in economic losses [2]. In power grid operations, time series data can be obtained through real-time monitoring and recording of the power grid frequency using phasor measurement units (PMUs). The objectives of deploying PMUs are to [3]: (i) capture slow spontaneous or anomalous oscillatory swings that are poorly damped; (ii) capture frequency transients from sudden losses of generation or load; (iii) capture power system disturbance data to support analyses of the events; and (iv) develop experience in recognizing disturbances as a precursor to the development of emergent states and unconventional transient state control. One application framework is depicted in Figure 1. Specifically, PMUs are deployed close to the transmission or distribution lines. The data collected are transmitted to a central data store, where the methods presented in this paper can be applied. In the future, advanced PMU technology may incorporate edge computing capabilities so that the methods presented in this paper can be embedded into the PMU devices for real-time event detection, eliminating the need to transmit data to a central location before a detection task can be performed.
The PMUs collect three types of data: frequency, voltage magnitude, and phase angle. In this paper, we consider only the frequency data. Frequency data are revealing because they provide information about system changes, namely, generation electromechanical transients, generation demand dynamics, and system operations, such as load shedding, breaker closing, and capacitor bank switching [4]. By design, the power frequency in the United States is 60 Hz (or 50 Hz in other countries). However, the power frequency fluctuates frequently and irregularly throughout the day within an extremely narrow range because of insignificant perturbations in the system. Consider, for example, the frequency data stream shown in Figure 2. The data are drawn from a single-phase PMU that captured the response to a generation loss. The resulting system frequency drop is a sharp decline from a steady-state frequency of about 60.01 Hz at around 17:50:58 to a quasi-steady-state frequency of about 59.93 Hz at around 17:51:08. The fluctuations in the data before the sharp drop are normal fluctuations (insignificant perturbations) that should be treated as part of the steady-state region before the drop. The same is true for the fluctuations around the quasi-steady-state region after the drop in frequency.
As shown in Figure 2, abnormal behaviors in power frequency due to disturbances are directly reflected in the PMU data. Thus, the increasing deployment of PMUs on the power grid is aiding the understanding of power grid dynamics. Consequently, PMU data have been used for wide-area situational awareness [5], disturbance event detection [6], load control [7], line outages [8], and inter-area oscillation analysis [9]. In this paper, we focus on event detection using data-driven approaches. Some work has already been done in this area (e.g., [10,11,12]). However, the proposed framework uses machine committee algorithms to achieve better detection accuracy. We consider two major detection approaches: event detection and change-point detection.
Event detection in time series refers to finding a data point or a contiguous subsequence in the time series that does not conform to the expected behavior of the system. In the power grid, this means detecting data points that deviate significantly from the design frequency; event detection methods therefore look for significant deviations from the normal power grid operating frequency. Change-point detection, on the other hand, refers to locating the points in time at which some aspect of the power frequency distribution changes, in other words, where the power frequency moves from a steady state to a quasi-steady state. We evaluate the proposed approaches using three real-world case studies.
Hence, the main contribution of this paper is four-fold: (1) it presents a machine committee framework for analyzing disturbances in power frequency using PMU data; (2) it develops a machine committee algorithm that uses five event detection methods to detect anomalous data points in PMU data; (3) it develops another machine committee algorithm that uses two change-point detection methods to detect phase changes in PMU data; and (4) it evaluates the proposed algorithms using three real-world case studies.
The rest of this paper is organized as follows.
Section 2 presents the framework for the machine committee algorithm and discusses the various event detection and change-point detection methods.
Section 3 describes the three real-world case studies and presents the results of the evaluation of the proposed algorithms.
Section 4 discusses the results and their implications for practical scenarios.
Section 5 concludes the paper with a brief summary and discusses plans for future studies.
2. Machine Committee Framework
The proposed machine committee framework consists of a group of detection methods, each of which has been widely used in diverse fields. The framework uses two machine committee algorithms: one is based on event detection (ED) methods, and the other is based on change-point detection (CPD) methods. Specifically, the proposed Event Detection Machine Committee (EDMC) algorithm invokes five basic ED methods to generate detection outputs with different confidence levels. The ED methods in the EDMC algorithm perform the same detection task individually, and their outputs are combined in a combiner to obtain better event detection performance. Similarly, the proposed Change-Point Detection Machine Committee (CPDMC) algorithm invokes two basic CPD methods to generate detection outputs with different confidence levels. There are different ways of combining the individual outputs in the combiner; in this paper, the combiner is based on a voting strategy. The following subsection describes the framework for the EDMC algorithm and the ED methods it uses.
2.1. The EDMC Algorithm
Figure 3 shows the framework for the EDMC algorithm. In committee machines, a computational task is solved by applying different methods and then combining the results of these computations. The idea behind a committee machine is that it aggregates the decisions of multiple agents, which potentially have different strengths and weaknesses. With the aggregated vote, the weaknesses are mitigated, and the majority vote leads to better results [13].
Formally, the input frequency data are given as $F = [f_1, f_2, \ldots, f_n]$, where $n$ is the number of input data points over a time window. The event locations in the dataset are the data points that do not follow the expected data behavior. The input data are preprocessed and fed into the next stage, where five ED methods are used. Each method detects events on the preprocessed data individually with three sets of parameters ($[p_1, p_2, p_3]$), and each method produces detection results at three levels: high confidence, mid confidence, and low confidence. Given a parameter $p$ and the input data $F$, the prediction for the $i$-th data point can be written as $\hat{y}_i(F, p)$, where $\hat{y}_i(F, p) = 0$ if the data point is normal, or $\hat{y}_i(F, p) = 1$ if the data point is an outlier. Thus, the voting result for an agent is calculated as $v_i = \sum_{j=1}^{3} \hat{y}_i(F, p_j)$, where $v_i$ has four potential values: normal data (0), low confidence outlier (1), mid confidence outlier (2), and high confidence outlier (3).
Then, the final detection results are voted from the results of the detection methods. Results with the same confidence level from the five methods are aggregated in this stage. For example, the final high confidence results are voted from the high confidence outputs of the five ED methods.
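As a minimal sketch of the confidence-level computation described above, the Python snippet below assumes that a method's level for each data point is the sum of its three binary outputs, which holds when the three parameter sets are nested from loose to strict. The `toy_detector` function and its thresholds are hypothetical stand-ins for an ED method and are not part of the proposed framework.

```python
def toy_detector(freqs, p, nominal=60.0):
    """Hypothetical ED method: flag points deviating from nominal by more than p Hz."""
    return [int(abs(f - nominal) > p) for f in freqs]


def confidence_levels(freqs, detector, params):
    """Sum the binary outputs of one detector over its parameter sets.

    Returns one level per data point: 0 (normal), 1 (low), 2 (mid), 3 (high),
    assuming three nested parameter sets.
    """
    outputs = [detector(freqs, p) for p in params]
    return [sum(column) for column in zip(*outputs)]


# Illustrative values only (not taken from the case studies).
freqs = [60.00, 60.02, 60.06, 60.20]
levels = confidence_levels(freqs, toy_detector, params=[0.01, 0.05, 0.10])
print(levels)
```

Under this assumption, a point flagged only by the loosest parameter set receives level 1, and a point flagged by all three receives level 3.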
Voting Strategy
An illustration of the voting strategy is shown in
Figure 4. We consider five detection methods and
n time periods, as shown in
Figure 4a. For each time period, the data classified as anomalies are represented with an
X, while normal data are represented with a 0. We can then say that Detector 1 classified all the data points as anomalies, while Detectors 2 and 5 classified data points at
and
as anomalies. Furthermore, Detector 3 classified data points at
and
as anomalies, while Detector 4 classified data points at
,
, and
as anomalies. Using a control number represented as
C, we can generate different outputs. If the number of the methods that identify the same data as an anomaly is no less than the control number
C, the data is voted as an event. As an illustration, if we set
C to 1, we can obtain outputs based on the Union voting strategy as shown in
Figure 4b; in this case, time
is the selected output. The output for
C equal to 4 is shown in
Figure 4c, which means that time
is the selected output. The output for
C set to 5 is shown in
Figure 4d; that is, time
is the final output.
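The voting rule with control number C can be sketched in a few lines of Python. The detection matrix below is hypothetical and is not the one shown in Figure 4; each row is one detector, each column is one time period, and 1 marks a data point classified as an anomaly.

```python
def committee_vote(detections, c):
    """Return 1 for each time period flagged by at least c detectors, else 0."""
    votes_per_period = [sum(column) for column in zip(*detections)]
    return [int(votes >= c) for votes in votes_per_period]


# Hypothetical outputs of five detectors over four time periods.
detections = [
    [1, 1, 1, 1],  # Detector 1
    [0, 1, 1, 0],  # Detector 2
    [0, 0, 1, 1],  # Detector 3
    [0, 1, 1, 1],  # Detector 4
    [0, 0, 1, 0],  # Detector 5
]

union_vote = committee_vote(detections, c=1)      # any detector is enough
strict_vote = committee_vote(detections, c=4)     # at least four must agree
unanimous_vote = committee_vote(detections, c=5)  # all five must agree
```

Setting C to 1 reproduces the Union strategy, while setting C to the number of detectors keeps only the points on which all detectors agree.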
2.2. Detection Methods
In this section we describe each of the five ED and two CPD methods implemented in this paper.
2.2.1. Gaussian Anomaly Detection Approach
A Gaussian distribution is a common basis for anomaly detection. In this method, the data are modeled with a Gaussian distribution, and the cumulative distribution probability (CDP) of each data point is given by the Gaussian cumulative distribution function:
$$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{(t-\mu)^2}{2\sigma^2}\right) dt,$$
where $\mu$ is the mean of the distribution and $\sigma$ is its standard deviation. A set of thresholds is used to determine the outliers. If the CDP of a data point falls below a lower threshold or above an upper threshold, the data point is detected as an anomaly. Specifically, for a threshold $p$, the CDP of normal data lies in $[p, 1 - p]$. The advantages of the Gaussian anomaly detection method include easy interpretation, low computation time, and fair performance. However, it is not an all-rounder; because it ignores the temporal order of the data, it can lose information.
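The Gaussian approach can be illustrated with the short Python sketch below. It assumes that the mean and standard deviation are estimated from the same window that is being tested and that a point is an outlier when its CDP falls outside $[p, 1 - p]$; the function name and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def gaussian_outliers(values, p):
    """Flag points whose cumulative probability lies below p or above 1 - p."""
    dist = NormalDist(mean(values), stdev(values))
    flags = []
    for v in values:
        cdp = dist.cdf(v)  # cumulative distribution probability of this point
        flags.append(int(cdp < p or cdp > 1.0 - p))
    return flags


# Illustrative frequency window (Hz) with one low reading at the end.
window = [60.00, 60.01, 59.99, 60.00, 60.01, 59.99, 60.00, 60.01, 59.99, 59.90]
flags = gaussian_outliers(window, p=0.05)
print(flags)
```

Because the outlier itself contributes to the estimated mean and standard deviation, a strong disturbance can partly mask itself; estimating the parameters from a known-normal window is one possible alternative.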
2.2.2. Nearest Neighbor Approach
The Nearest Neighbor (NN) approach, which is based on a similarity measure, calculates the distance from a data point to its k-th nearest neighbor. The distance reflects the sparseness of the neighborhood of a data point. For example, data points with a larger nearest-neighbor distance typically lie in sparser neighborhoods and are more likely to be outliers. We choose three different numbers of nearest neighbors as the parameters of the approach. Then, for each data point, we calculate the mean of the distances between the data point and its neighbors. A threshold determines whether a data point is an anomaly.
In our method, we use both the Euclidean distance based NN (NN-E) approach and the Mahalanobis distance based NN (NN-M) approach. The Euclidean distance between two points $x$ and $y$ in one-dimensional space, which is the length of the segment between the two points, is given as:
$$d_E(x, y) = \sqrt{(x - y)^2} = |x - y|.$$
The Mahalanobis distance is given as:
$$d_M(x) = \sqrt{(x - \mu)^{T} S^{-1} (x - \mu)},$$
where $\mu$ is the mean of the neighbors' values and $S$ is the covariance matrix of the data. The standard Euclidean distance metric is easy to compute and interpret, but it is not always the best choice for distance calculation. The Mahalanobis distance, which takes the correlation of the data into account, may perform better in some scenarios [14].
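A minimal Python sketch of the Euclidean variant is given below. It scores each point by the mean distance to its k nearest neighbors in one dimension and flags the point when the score exceeds a threshold. The threshold here is an absolute distance in Hz, which is an assumption made for illustration and may differ from the threshold definition used in the experiments.

```python
def knn_mean_distance(values, k):
    """Mean absolute distance from each point to its k nearest neighbors."""
    scores = []
    for i, v in enumerate(values):
        distances = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(sum(distances[:k]) / k)
    return scores


def knn_outliers(values, k, threshold):
    """Flag points whose mean k-NN distance is larger than the threshold."""
    return [int(score > threshold) for score in knn_mean_distance(values, k)]


# Illustrative frequency values (Hz); the last point sits in a sparse region.
values = [60.00, 60.01, 60.02, 60.01, 59.90]
flags = knn_outliers(values, k=2, threshold=0.05)
print(flags)
```

For one-dimensional data, the Mahalanobis distance reduces to the absolute difference divided by the standard deviation, so the same structure applies with a rescaled distance.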
2.2.3. Local Outlier Factor Approach
The local outlier factor (LOF), which is based on local density, represents the degree to which a data point is an outlier. The local density of a data point is compared with the local densities of its neighbors, and points with a lower density than their neighbors are declared outliers. The local density is also calculated from the distance matrix. As with the NN method, three different numbers of nearest neighbors are set as the parameters of the approach. The advantage of this method is that it can capture outliers that are close to their neighbors but have a lower local density than those neighbors.
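The LOF computation can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the data are one-dimensional, each point uses exactly k neighbors (ties at the k-th distance are not expanded), and a small constant guards against division by zero. It is not the implementation used in the experiments.

```python
def local_outlier_factor(values, k, eps=1e-12):
    """Simplified LOF scores for 1-D data; scores well above 1 suggest outliers."""
    n = len(values)
    neighbors, k_distance = [], []
    for i in range(n):
        ranked = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        neighbors.append([j for _, j in ranked[:k]])
        k_distance.append(ranked[k - 1][0])  # distance to the k-th nearest neighbor

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], abs(values[i] - values[j])) for j in neighbors[i]]
        lrd.append(1.0 / (sum(reach) / k + eps))

    # LOF: average ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Illustrative frequency values (Hz); the last point is isolated.
values = [60.00, 60.01, 60.02, 60.03, 60.04, 59.50]
scores = local_outlier_factor(values, k=2)
print([round(s, 2) for s in scores])
```

Points inside the dense cluster obtain scores close to 1, whereas the isolated point obtains a much larger score because its neighbors are far denser than it is.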
2.2.4. Prophet Approach
While the above methods have their advantages, none of them considers the time series structure of the data, which may contain periodic changes and trends. Prophet is an efficient time series forecasting tool developed by Facebook's data science team [15].
Prophet uses a decomposable time series model with three main components: trend, seasonality, and holidays [16]. They are combined as:
$$y(t) = g(t) + s(t) + h(t) + \epsilon_t,$$
where $g(t)$ is the trend function for non-periodic changes in the time series, $s(t)$ models weekly and yearly periodic changes in the data, $h(t)$ is the holiday function, which models the irregular changes in the data, and $\epsilon_t$ is a normally distributed error term representing the changes that cannot be modeled by the previous functions.
2.3. The CPDMC Algorithm
Change-points are characterized as abrupt variations in time series data [17]. Such abrupt changes may represent transitions from one state to another; in power grid frequency data, an abrupt change represents a transition from a steady state to a quasi-steady state, as shown in Figure 2. CPD is the task of finding where those abrupt changes occur in time series data. CPD algorithms are usually classified as offline or online. The framework for the CPDMC algorithm is similar to that of the EDMC algorithm, with the five ED methods replaced by two CPD methods: an offline CPD method and an online CPD method.
2.3.1. Offline CPD Approach
The offline CPD method considers the entire data set at once and is most appropriate for batch implementation. Thus, offline CPD-based algorithms look back in time to determine where changes have occurred. Various offline CPD algorithms have been developed for different domains [18]. In this paper, we implement the efficient Bayesian offline CPD method, details of which can be found in [19].
Basically, we assume that the data can be partitioned into $K$ segments. The marginal likelihood produced by a single model $m$ for the data from time $s$ to $t$ is given as:
$$P(x_{s:t} \mid m) = \int \prod_{i=s}^{t} p(x_i \mid \theta, m)\, p(\theta \mid m)\, d\theta,$$
where $\theta$ denotes the parameters of model $m$. If, as the segment grows, it includes data generated from different model types or parameters, the marginal likelihood drops, which suggests that a change-point is present and that two models should be applied [6].
2.3.2. Online CPD Approach
The online CPD approach, on the other hand, processes data in real time, that is, as each data point becomes available. The goal is to detect a change-point as soon as possible after it occurs, ideally before the next data point arrives [17].
Adams and MacKay present a Bayesian CPD method for online inference [20]. Rather than the retrospective segmentation used by offline CPD, they use causal predictive filtering, which generates a prediction of the next data point in the sequence. Intuitively, the predictive probability of the next unseen data point is calculated from the existing data. If the next data point deviates from the prediction by a large margin, it is declared a change-point. The predictive probability is calculated from the marginal predictive distribution, which is given as:
$$P(x_{t+1} \mid x_{1:t}) = \sum_{r_t} P(x_{t+1} \mid r_t, x_t^{(r)})\, P(r_t \mid x_{1:t}),$$
where $r_t$ is the run length, that is, the number of time steps since the last change-point, and $x_t^{(r)}$ is the set of observations associated with the run $r_t$.
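To make the run-length idea concrete, the following Python sketch implements a small Bayesian online CPD recursion. It is an illustration under simplifying assumptions that are not stated in the paper: the observations are Gaussian with known variance, the segment mean has a Gaussian prior, and the hazard rate (the prior probability of a change at each step) is constant. The function names and the toy signal are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def online_cpd(data, hazard=0.01, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Return the most probable run length after each observation."""
    run_probs = [1.0]                             # run length 0 before any data
    means, variances = [prior_mean], [prior_var]  # posterior of the mean per run length
    map_run_lengths = []
    for x in data:
        # Predictive probability of x under each current run length.
        pred = [normal_pdf(x, m, v + obs_var) for m, v in zip(means, variances)]
        weighted = [p * q for p, q in zip(run_probs, pred)]
        growth = [w * (1.0 - hazard) for w in weighted]  # the run continues
        change = hazard * sum(weighted)                   # the run resets to 0
        joint = [change] + growth
        total = sum(joint)
        run_probs = [j / total for j in joint]
        # Conjugate update of the mean for every run that absorbed x.
        new_means, new_vars = [prior_mean], [prior_var]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            new_means.append(post_var * (m / v + x / obs_var))
            new_vars.append(post_var)
        means, variances = new_means, new_vars
        map_run_lengths.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return map_run_lengths


# Toy signal: deviation from nominal, with a level shift after 20 samples.
signal = [0.0] * 20 + [5.0] * 20
run_lengths = online_cpd(signal)
print(run_lengths)
```

Before the shift, the most probable run length grows by one at every step; after enough post-shift samples have arrived, it falls back to roughly the number of samples observed since the shift, which indicates where the change-point lies.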
3. Evaluation
In this section, the proposed EDMC and CPDMC algorithms are evaluated using three real-world case studies. We first enumerate the parameters used in the experiments, then describe the synchrophasor data used for the evaluation, and finally present the results.
3.1. Parameters for the Methods
Table 1 shows the parameters used for each method in the experiments. For each method, three sets of parameters were selected. For the EDMC combiner, C is set to 2 for the voting strategy, while for the CPDMC combiner, C is set to 1 for the final CP probabilities.
The threshold for the Gaussian approach is user-defined and can be regarded as a user-chosen p-value. In this paper, we chose [0.01, 0.05, 0.1] as the three thresholds. Assuming that the data follow a normal distribution, the CDP of an outlier is less than 0.1 or larger than 0.9. Consequently, the CDP for high confidence outliers is less than 0.01 or larger than 0.99; the CDP for mid confidence outliers is in the [0.01, 0.05] or [0.95, 0.99] range; and the CDP for low confidence outliers is in the [0.05, 0.1] or [0.90, 0.95] range.
For the NN method, the detection thresholds are 0.1, 0.2, and 0.3. In this case, the threshold is expressed as a percentage of the number of nearest neighbors used in the method. Through trial and error, we found that when the threshold is more than 0.3, the results differ little. With thresholds of 0.1, 0.2, and 0.3, the method shows different performance levels, which is suitable for a voting system.
The reason for the choice of the LOF parameters is very similar to that for NN. To obtain different performance levels from the LOF method, the parameters used are 0.3, 0.5, and 0.7, based on empirical experiments.
For the Prophet method, the interval width is set to 0.99 so that the full boundary of the method can be used. Furthermore, we used different ranges of historical data to estimate the trend in the data. Specifically, 10%, 30%, and 50% of the data were used to obtain different performance levels.
The parameters for the CPD approaches are based on heuristics. We observed that the change-point probability for normal data is no more than 0.1 and that the probability for a high confidence CP is always larger than 0.5. We then used 0.25 to distinguish between low confidence and mid confidence results.
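The heuristic thresholds for the CPD approaches can be expressed as a simple mapping from a change-point probability to a confidence level. In the Python sketch below, the function name is hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
def cp_confidence(probability):
    """Map a change-point probability to 0 (normal), 1 (low), 2 (mid), or 3 (high)."""
    if probability > 0.5:
        return 3  # high confidence change-point
    if probability > 0.25:
        return 2  # mid confidence change-point
    if probability > 0.1:
        return 1  # low confidence change-point
    return 0      # normal data


# Illustrative probabilities only.
cp_levels = [cp_confidence(p) for p in [0.05, 0.20, 0.30, 0.80]]
print(cp_levels)
```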
3.2. Synchrophasor Data
The single-phase synchrophasor data, which contain a time stamp and a frequency value (10 or more measurements per second), are collected by thousands of PMUs that are deployed on the power grid in the USA [9]. The high volume, velocity, and variety of PMU measurement data make it possible to take advantage of artificial intelligence techniques in applications such as the detection of short-time events and faults.
For the evaluation of the proposed framework, we consider the following three case studies of real-world disturbances to the power grid.
3.2.1. Case Study 1
In this case study, an event occurred during a large severe storm system on the Eastern Interconnection in the USA on 4 April 2011 (Case study 1 YouTube animation video: https://www.youtube.com/watch?v=KmK2VMG57gw). We used the data collected from the PMU deployed at Florida State University for this evaluation. Figure 5 presents the data and the results of the EDMC methods and algorithm. The results are grouped into three classes: high-confidence events (red), mid-confidence events (orange), and low-confidence events (green). Figure 5a–e show the results of the individual event detection methods, Figure 5f shows the EDMC results, and Figure 6 shows the CPDMC results.
3.2.2. Case Study 2
In the second case study, an event occurred when a 1600 MW generator trip was caused by the East Coast earthquake on 23 August 2011 (Case study 2 YouTube animation video: https://www.youtube.com/watch?v=XUN_h-k8kBg). The data were collected from the PMU deployed at Atlantic City, New Jersey. Figure 7 shows the results of the EDMC algorithm on these data. From the figure, we can see a frequency drop starting around 17:50:58 because of the generator trip. Figure 8 shows the CPDMC results.
3.2.3. Case Study 3
The third case study covers an event that occurred when a 500-kV line connecting Arizona with San Diego tripped following a capacitor switch-out (Case study 3 YouTube animation video: https://www.youtube.com/watch?v=YsksUyeLu2Y). Approximately 1.4 million people were affected. The PMU deployed at Arizona State University captured the event, which generated a frequency peak between 22:38:10 and 22:38:24. The results of the EDMC algorithm are shown in Figure 9, while the results of the CPDMC algorithm are shown in Figure 10.
3.3. Performance
We use the True Positive Rate (TPR) and the False Positive Rate (FPR) to compare the performance of EDMC with that of its individual approaches. The TPR and FPR are given as:
$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN},$$
where $TP$, $FN$, $FP$, and $TN$ represent the numbers of true positives, false negatives, false positives, and true negatives, respectively. The TPRs and FPRs of each approach for the three case studies are presented in Table 2. The best value in each case is shown in red, while the worst value is shown in blue. In Case Study 1, LOF has the best TPR, but it also has the worst FPR. Gaussian, NN_E, and NN_M have zero FPR, but their TPRs are low. EDMC has a promising TPR with a better FPR. In Case Study 2, EDMC performed the best for both TPR and FPR. In Case Study 3, LOF has the best TPR and the worst FPR, which is similar to the Case Study 1 results. The FPRs of the other basic approaches are not good compared with that of EDMC. Overall, EDMC offers a good trade-off between TPR and FPR.
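For reference, the two rates can be computed from binary ground-truth labels and binary detections as in the Python sketch below. The labels are illustrative and are not taken from the case studies; returning 0.0 when a denominator is zero is a convention chosen here for the sketch.

```python
def tpr_fpr(truth, predicted):
    """Return (TPR, FPR) for binary labels, where 1 marks an event."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Illustrative labels: two true events, one detected, and one false alarm.
truth = [1, 1, 0, 0, 0]
predicted = [1, 0, 1, 0, 0]
tpr, fpr = tpr_fpr(truth, predicted)
print(tpr, fpr)
```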
5. Conclusions
This paper proposes a machine committee framework for power grid disturbance analysis using synchrophasor data. The framework consists of two algorithms: EDMC and CPDMC. Each is an ensemble-based algorithm that runs several detection methods and automatically combines their outputs using the voting strategy. The EDMC algorithm combines five ED methods, while the CPDMC algorithm combines two CPD methods. The algorithms were tested using three real-world data sets. From the results of the evaluation, we conclude that the EDMC algorithm is a better fit for analyzing power grid disturbances recorded by synchrophasors. The CPDMC algorithm generated many false alarms, and its probability of detection is very low for event regions.
Our conclusion is limited to the three event cases evaluated in this paper. In the future, additional studies will include more cases with diverse disturbance events. In addition, future studies will include longer time series to understand the effect of the length of the time series on the performance of the algorithms.