Volume 9, Issue 8, August – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24AUG1475

Support Vector Machine based Data Hacking Prediction using PMU Data

Sushma¹
Department of Electrical and Electronics Engineering,
Ghousia College of Engineering, Ramanagara, Karnataka

Amanulla²
Associate Professor,
Department of Electrical and Electronics Engineering,
Ghousia College of Engineering, Ramanagara, Karnataka

Javid Akthar³
Professor and HOD,
Department of Electrical and Electronics Engineering,
Ghousia College of Engineering, Ramanagara, Karnataka

Abstract:- As global reliance on power systems grows due to increasing energy demands and modern consumption patterns, maintaining the stability and reliability of the power grid has become crucial. Power systems are complex and nonlinear, and their operations are continuously evolving, making it difficult and expensive to ensure stability. Traditionally, power systems are designed to handle a single outage at a time. However, recent years have seen several significant and extensively reported blackouts, each originating from a single failure. These reports are vital for mitigating operational risks by strengthening systems against identified high-risk scenarios. While extensive research has been conducted on these blackouts, cyber-attacks introduce a new dimension of risk. The advent of Phasor Measurement Units (PMUs) has enabled centralized monitoring of power system data, allowing for more effective fault and cyber-attack detection. This paper proposes a machine learning-based approach to detecting cyber-attacks using PMU data. Given the complexity and volume of power system data, traditional mathematical and statistical methods are challenging to implement. Instead, a Support Vector Classification (SVC) algorithm is used for binary classification, distinguishing between 'attack' and 'normal' states. The algorithm is trained on PMU data and evaluated using metrics such as the AUC-ROC curve and confusion matrix, achieving an 82% AUC-ROC score, demonstrating its effectiveness in identifying cyber-attacks.

Keywords:- Cyber Attack; Support Vector Machine; AUC-ROC; Support Vector Classification.

I. INTRODUCTION

The data transmitted from Phasor Measurement Units (PMUs) to Phasor Data Concentrators (PDCs) can be easily accessed and modified, posing significant security risks. Although previous attacks have been confined to local area networks (LANs), similar vulnerabilities can be exploited over wide area networks (WANs) such as the Internet. Research has highlighted weaknesses in the Border Gateway Protocol (BGP), which can allow attackers to reroute data packets to unintended destinations [1]. To address these risks, implementing a unique network architecture, despite its cost, is crucial. Additionally, enforcing mandatory updates for default passwords can help prevent unauthorized access.

To counter these security challenges, several methods have been proposed. Principal Component Analysis (PCA) and Support Vector Machines (SVM) can be used to identify fraudulent data entries. A data-driven approach utilizing spatiotemporal relationships in PMU measurements has been suggested to differentiate between real and fake power grid events [2]. Enhancing security through bit masking has been proposed to ensure data integrity and confidentiality [3]. Work has also progressed on a simulation testbed for cybersecurity research that operates on the PMU time frame. The simulation application was created by the University of Illinois at Urbana-Champaign and is both interactive and extensible. The package includes three customizable simulators: a PMU, a PDC, and a control center. Moreover, in addition to the previously mentioned methodologies, artificial neural networks (ANN) are widely used for classification and prediction [4]. The ANN model can be represented as either a simple feedforward neural network (FNN) or a more intricate deep neural network (DNN) [5]. The model is obtained by solving an optimization problem, which can be tackled efficiently with various local and global methods such as gradient-based search techniques [6], genetic methods [7], and others.

Unsupervised learning (UL) refers to the extraction of significant patterns from unlabeled data. This process entails extracting pertinent attributes, classifications, and frameworks straight from the unprocessed data, without any manual intervention such as labeling or input.

Artificial neural networks (ANNs), including both simple feedforward neural networks (FNNs) and more complex deep neural networks (DNNs), are widely used for classification and prediction. Optimization techniques such as gradient-based searches and genetic algorithms are employed to refine ANN models. Unsupervised learning (UL) methods like Isolation Forests (IF) and Autoencoders (AE) are used to detect anomalies such as false data injection


attacks (FDIA) and Denial of Service (DoS) attacks [8], [9], [10], [11]. Deep Belief Networks (DBN) are also utilized for attack detection [12]. Semi-supervised learning (SSL) combines labeled and unlabeled data to enhance detection capabilities. Techniques like semi-supervised adversarial autoencoders (SSAA) and generative-adversarial frameworks are proposed for improved FDIA detection, with new models such as SS-deep-ID and robust semi-supervised prototypical networks (RSSPN) offering advanced detection methods [13], [14], [15], [16].

II. METHODOLOGY

PMU Dataset

The dataset employed for classification consists of various features, as detailed in Table 1. It encompasses 128 attributes, with the target variable denoting whether the measurement pertains to a 'fault' or a 'normal' event. The data originates from Phasor Measurement Units (PMUs), which are sophisticated devices designed to capture and compute electrical waveforms on the power grid by synchronizing with a standard time reference. Each PMU records 29 distinct types of measurements, resulting in a total of 116 measurement columns across four PMUs.

Additionally, the dataset includes three types of logs: relay logs, control panel logs, and Snort logs. Relay logs document the activities of protective relays that monitor electrical parameters and initiate protective measures to ensure system safety. Control panel logs capture activities and statuses from control panels that oversee and manage the power system. Snort logs come from an open-source network intrusion detection system that tracks and analyzes network traffic for malicious activities.

In total, the dataset includes 128 attributes: 116 from the PMU measurements and 12 from the logs. This comprehensive dataset is used to train models to classify whether events are "normal" or indicative of an "attack," with the classification target labeled as "Marker."

SVC based Detection Algorithm

The attack events arise from the different cyber-attacks that can target a power system. The first type is data injection, which may cause a relay to operate although no fault is actually present in the power system. The second type is the remote tripping fault, in which the cyber-attack alone trips the relay without any event occurring. This is called the command injection attack type, and it has two subtypes: command injection in a single relay and in multiple relays. The third type is the relay setting change attack, which prevents the relay from acting even when a fault is present. The 128 features or variables primarily originate from synchrophasors or phasor measurement units (PMUs). The data was sampled at 120 samples per second, with each scheme simulated for 17 seconds. Different fault details are shown in Table 1. The fault prediction process comprises four key components: Data Preprocessing Automation, Outlier Detection and Feature Engineering, Training and Testing, and Model Evaluation. The Support Vector Machine (SVM) method is utilized to enhance the model's generalization capabilities.

Table 1 Attack Event Scenarios in Power System

Data preprocessing involves several automated steps, including anomaly detection, data cleaning, and the organization of data into balanced and unbalanced datasets. This process establishes the framework for the fault prediction model. Automated procedures address data impurity and missing values, with mean values used to replace missing entries. Given the critical role of fault prediction in electrical systems, ensuring the reliability of the prediction algorithm is paramount.

To handle large volumes of data effectively, the method must offer strong generalization and utilize highly orthogonal inputs.

Advanced feature engineering techniques may be needed to improve prediction accuracy, especially if the data exhibits significant correlations. Developing a data-aware preprocessing strategy is complex but essential. The workflow includes dividing the dataset into training and testing subsets, each containing relevant CSV files. For the PMU cyberattack detection, the target variable indicates whether an attack has occurred. The machine learning model's objective is to predict if the PMU data suggests a cyberattack.
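
The components named in this section can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the data are synthetic stand-ins generated with NumPy rather than the PMU dataset, scikit-learn is assumed as the library, the RBF kernel and the 70/30 split are arbitrary choices, the labels are assumed to be 1 for natural events and 0 for attacks, the balanced dataset is assumed to be obtained by random undersampling of the majority class, and the helper undersample is hypothetical. Outlier detection and feature engineering are omitted for brevity.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the PMU table (NOT the paper's data):
# 600 "natural" rows (label 1) and 300 "attack" rows (label 0), 8 features.
n_natural, n_attack, n_features = 600, 300, 8
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_natural, n_features)),
    rng.normal(1.0, 1.0, size=(n_attack, n_features)),
])
y = np.concatenate([np.ones(n_natural, dtype=int), np.zeros(n_attack, dtype=int)])

# Blank out about 2% of the entries to imitate missing measurements.
X[rng.random(X.shape) < 0.02] = np.nan


def undersample(X, y, rng):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    keep.sort()
    return X[keep], y[keep]


X_bal, y_bal = undersample(X, y, rng)

X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.3, stratify=y_bal, random_state=0
)

# Mean imputation and scaling are fitted on the training split only.
model = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

# AUC-ROC uses the signed distance to the decision surface as the score.
auc = roc_auc_score(y_test, model.decision_function(X_test))

# With labels=[0, 1], ravel() yields tn, fp, fn, tp for positive class 1.
tn, fp, fn, tp = confusion_matrix(y_test, model.predict(X_test), labels=[0, 1]).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"AUC={auc:.2f} acc={accuracy:.2f} prec={precision:.2f} "
      f"rec={recall:.2f} spec={specificity:.2f} f1={f1:.2f}")
```

Placing the imputer and scaler inside the pipeline means the column means are computed from the training rows alone, so no information from the test rows leaks into preprocessing. The numbers printed by this sketch describe the synthetic data only and say nothing about the PMU dataset.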


III. RESULTS AND DISCUSSION

Box plots for the first 14 columns of numerical data in the Phasor Measurement Unit (PMU) dataset were generated with the seaborn library for Python. Because this is a classification problem, the sizes of the majority and minority classes must be checked to determine whether the data is balanced or imbalanced. The class distribution graph for the PMU data considered is given in Figure 1.

Fig 1 Class Distribution Graph

The confusion matrix is a fundamental tool for assessing the performance of classification models. It provides detailed insights into how well the model predicts each class, helping identify strengths and weaknesses. By understanding and utilizing the confusion matrix, you can make informed decisions about model improvements and better evaluate its effectiveness. The confusion matrix for the implementation of cyber-attack detection is given in Figure 2. The label '1' is the Natural class and '0' is the attack class. Since the attack class is the minority class, the samples of the majority class were reduced to equal the number of minority-class samples to make the data balanced.

Fig 2 Confusion Matrix

The performance of the cyber-attack detection implementation is found to be satisfactory, with an area under the ROC curve of 0.82. This value summarizes how well the classifier separates the two classes across all decision thresholds: a randomly chosen sample of the positive class receives a higher score than a randomly chosen sample of the negative class about 82% of the time. It is not the fraction of correct classifications, which is the accuracy reported in Table 2. With further tuning of the algorithm, the performance can be improved. The performance metrics obtained from the analysis are given in Table 2.

Table 2 Performance Metrics
Metric                  Value
Accuracy                0.76
Precision               0.69
Recall (Sensitivity)    0.94
F1 Score                0.80
Specificity             0.58

IV. CONCLUSION

This work presents a machine learning-driven approach to cyber-attack detection in power systems. A support vector classifier is implemented to classify events from the PMU data gathered from the power system. The PMU dataset, having 32 attributes from four such PMUs, is used in the prediction implementation. The imbalance in the data is treated by reducing the majority class to the same number of samples as the minority class for better performance. The findings demonstrate that the support vector machine approach greatly enhances cyber-attack detection performance. The method's 82% AUC-ROC score underscores its promise as a dependable detection algorithm.

REFERENCES

[1]. internet. http://www.wired.com/threatlevel/2013/12/bgp-hijacking-belarus-iceland/. Accessed: 2022-12-15.
[2]. Q. Sun, L. Shi, Y. Ni, D. Si, and J. Zhu, “An enhanced cascading failure model integrating data mining technique,” Protection Control Mod. Power Syst., vol. 2, no. 1, pp. 209–219, Jan. 2017.
[3]. R. Vijayanand, D. Devaraj, B. Kannapiran, and K. Kartheeban, “Bit masking based secure data aggregation technique for Advanced Metering Infrastructure in Smart Grid system,” in Proc. Int. Conf. Comput. Commun. Inform., Jan. 2016, pp. 45–54.
[4]. T. J. Overbye, Z. Mao, K. S. Shetye, and J. D. Weber, “An interactive, extensible environment for power system simulation on the PMU time frame with a cyber security application,” in Proc. IEEE Power Energy Conf., Feb. 2017, pp. 1–6.
[5]. Z. Mao, T. Xu, and T. J. Overbye, “Real-time detection of malicious PMU data,” in Proc. Int. Conf. Intell. Syst. Appl. Power Syst., Sep. 2017, pp. 121–128.
[6]. https://www.kaggle.com/bachirbarika/power-system


[7]. S. Wang, M. Roger, J. Sarrazin et al., “Hyperparameter optimization of two-hidden-layer neural networks for power amplifiers behavioral modeling using genetic algorithms,” IEEE Microwave and Wireless Components Letters, vol. 29, no. 12, pp. 802-805, Dec. 2019.
[8]. S. Ahmed, Y. Lee, S. Hyun et al., “Unsupervised machine learning-based detection of covert data integrity assault in smart grid networks utilizing isolation forest,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 10, pp. 2765-2777, Oct. 2019.
[9]. J. Wang, D. Shi, Y. Li et al., “Distributed framework for detecting PMU data manipulation attacks with deep autoencoders,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 4401-4410, Jul. 2019.
[10]. M. Aboelwafa, K. Seddik, M. Eldefrawy et al., “A machine-learning-based technique for false data injection attacks detection in industrial IoT,” IEEE Internet of Things Journal, vol. 7, no. 9, pp. 8462-8471, Sept. 2020.
[11]. K. Lu, G. Zeng, X. Luo et al., “Evolutionary deep belief network for cyber-attack detection in industrial automation and control system,” IEEE Transactions on Industrial Informatics, vol. 17, no. 11, pp. 7618-7627, Nov. 2021.
[12]. I. Sohn, “Deep belief network based intrusion detection techniques: a survey,” Expert Systems with Applications, vol. 167, pp. 1-9, Apr. 2021.
[13]. Y. Zhang, J. Wang and B. Chen, “Detecting false data injection attacks in smart grids: A semi-supervised deep learning approach,” IEEE Transactions on Smart Grid, vol. 12, no. 1, pp. 623-634, Jan. 2021.
[14]. M. Farajzadeh-Zanjani, E. Hallaji, R. Razavi-Far et al., “Adversarial semi-supervised learning for diagnosing faults and attacks in power grids,” IEEE Transactions on Smart Grid, vol. 12, no. 4, pp. 3468-3478, Jul. 2021.
[15]. M. Abdel-Basset, H. Hawash, R. Chakrabortty et al., “Semi-supervised spatiotemporal deep learning for intrusions detection in IoT networks,” IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12251-12265, Aug. 2021.
[16]. T. Zheng, Y. Liu, Y. Yan et al., “RSSPN: robust semi-supervised prototypical network for fault root cause classification in power distribution systems,” IEEE Transactions on Power Delivery, Nov. 2021.
DOI: