
2020 IEEE International Conference on Computing, Power and Communication Technologies (GUCON)

Galgotias University, Greater Noida, UP, India. Oct 2-4, 2020

Fault Prediction of Transformer Using Machine Learning and DGA

Saravanan D
School of Electrical, Electronics and Communication Engineering, Galgotias University
saravanan228@gmail.com

Ahmad Hasan
School of Electrical, Electronics and Communication Engineering, Galgotias University
mehasan.ahmad@gmail.com

Ajit Singh
School of Electrical, Electronics and Communication Engineering, Galgotias University
Ajitsingh041997@gmail.com

Hannan Mansoor
Dept. of Computer Science and Engineering (Assistant Professor), Jamia Millia Islamia
mansoorhannan@gmail.com

Rabindra Nath Shaw
School of Electrical, Electronics and Communication Engineering, Galgotias University
r.n.s@ieee.org

Abstract—Power transformers are the most crucial part of the power system, and their failure may result not only in interrupted power supply but also in great economic loss. So, it is important to monitor transformer health on a daily basis. Many diagnostic techniques are available for this purpose, of which DGA (Dissolved Gas Analysis) has been an important one. Although DGA is a good technique, it depends mostly on human expertise; hence it is not the fastest fault-diagnostic tool. This paper uses a Multilayer Artificial Neural Network model and a Support Vector Machine classifier model to predict the fault condition of a transformer from DGA data. The Support Vector Machine classifier shows better results, with accuracy around 81.4%, than the Multilayer Artificial Neural Network, which gives prediction accuracy around 76%.

Keywords—DGA (Dissolved Gas Analysis), MLANN, Support Vector Machine Classifier, Fault prediction, Data pre-processing.

I. INTRODUCTION

Power transformers are the most crucial part of the power system and are often considered its backbone. A transformer operates on the principle of electromagnetic induction. Power transformers account for a large percentage of the cost of the whole network and are the most expensive elements of the power system. The failure of a transformer may have a huge effect on industrial, commercial, and individual consumers.

So, in order to ensure uninterrupted and smooth working of transformers, condition monitoring is a very important tool that can be applied at a very reasonable cost for large, expensive transformers [1]. When considering the health of a power transformer, various components come into play. Faults in transformers depend on various factors, but the gases forming inside the transformer oil are considered one of the main causes [2].

Various techniques are used for monitoring, but the one most widely used by far is DGA. In DGA, the gases are extracted from the transformer oil tank and then analyzed using gas chromatography [3]. This gives the amount of each gas present in ppm. The gases generated in transformer oil are hydrogen (H2), methane (CH4), acetylene (C2H2), ethylene (C2H4), ethane (C2H6), carbon dioxide (CO2), and carbon monoxide (CO). According to IEC 60599, the faults related to transformers are mainly of five types:

1. Partial Discharge (PD).
2. Discharge of low energy (D1).
3. Discharge of high energy (D2).
4. Thermal fault below 300°C (T1) and thermal fault above 300°C but below 700°C (T2).
5. Thermal fault above 700°C (T3).

Although DGA has proved an important method, its reliance on expertise is a great problem [4]. So, it is important to come up with a more reliable method which is self-sustaining and does not require much human expertise, i.e., an intelligent condition-monitoring system. Many experiments and studies were carried out to achieve this goal, and machine learning has proved an ideal tool for it.

The authors utilized machine learning algorithms for the classification of faults using a DGA dataset [5]. The algorithms used are the Multilayer Neural Network and Support Vector Classification. The available DGA data is used with these learning algorithms to build models, which give us the hypothesis (the mapping function between features, i.e., inputs, and outputs) [6].

This article is organized into five sections as follows. Section 1 is the Introduction. Section 2, Preparation for Model Construction, briefly describes the available data and its preprocessing. Section 3, Implementation, describes the implemented models. Section 4 presents the results and comparison. Section 5 is the Conclusion.

II. PREPARATION FOR MODEL CONSTRUCTION

The dataset consists of information related to 329 transformers. It has six features, of which the first five represent the gases hydrogen (H2), methane (CH4), acetylene (C2H2), ethylene (C2H4), and ethane (C2H6) in ppm, and the sixth represents the actual status of the transformer at the time of sample collection. In this article only the five gases were considered for the analysis, due to ease of data availability [5].

978-1-7281-5070-3/20/$31.00 ©2020 IEEE 1

Authorized licensed use limited to: Universidade Estadual de Campinas. Downloaded on June 08,2021 at 18:17:18 UTC from IEEE Xplore. Restrictions apply.
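As an illustration of the dataset layout just described, a DGA sample can be modelled as a small record of the five gas concentrations plus the recorded status. This sketch is not from the paper's code; the field names and the sample values are invented for illustration.

```python
from dataclasses import dataclass

# The five IEC 60599 fault categories plus the "No Fault" class used in this work.
FAULT_LABELS = ["PD", "D1", "D2", "TL", "TH", "NF"]

@dataclass
class DGASample:
    h2: float    # hydrogen, ppm
    ch4: float   # methane, ppm
    c2h2: float  # acetylene, ppm
    c2h4: float  # ethylene, ppm
    c2h6: float  # ethane, ppm
    status: str  # one of FAULT_LABELS, recorded at sample collection

# A made-up record, only to show the shape of one row of the dataset.
sample = DGASample(h2=160.0, ch4=130.0, c2h2=0.1, c2h4=33.0, c2h6=96.0, status="TL")
assert sample.status in FAULT_LABELS
```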
There are five types of fault according to IEC 60599, and this work has considered No Fault as a sixth category for constructing the model. The models are trained to classify the six categories given in Table I.

TABLE I. FAULT TYPES IN DATASET

Fault type                                      Acronym used    No. of samples
Partial discharge                               PD              25
Low energy discharge                            D1              75
High energy discharge                           D2              102
Thermal fault, low and medium temp. (T<700°C)   TL              63
Thermal fault, high temperature (T>700°C)       TH              57
No fault                                        NF              27

Implementation of the machine learning algorithms is done using the Spyder IDE and the Python 3.5 programming language.

A. Data Pre-Processing

Data preprocessing is a vital part of training a machine learning model, as the state of the data can directly affect learning. Mostly, data is drawn from various sources, which makes it non-uniform and ambiguous. So, it is important to preprocess data before feeding it to a machine learning model [7].

If the data we are using contains inadequate or irrelevant information, the model may give less accurate results, or may fail to discover anything of use at all. Thus, data pre-processing is an important step in machine learning. The pre-processing step is used to resolve several types of problems, such as noise in the data, redundant data, and missing values. All machine learning algorithms rely heavily on the product of data pre-processing, which is the final training set [8].

Fig. 1. Data Pre-processing

Data pre-processing includes:
• Loading the DGA dataset.
• Handling missing data.
• Handling text labels.
• Separating dependent and independent variables.
• Handling data with categories.
• Normalizing the data.
• Splitting training and testing data.

B. Complete Flow

TABLE II. TECHNIQUES APPLIED IN THIS ARTICLE'S FLOW

Acronym     Full name
DGA         Dissolved Gas Analysis
MLANN       Multilayer Artificial Neural Network
SVM         Support Vector Machine

III. IMPLEMENTATION

The dataset is loaded and divided into the input features (X) and the output features (y). The input features are the concentrations in ppm of the gases acetylene (C2H2), ethylene (C2H4), hydrogen (H2), methane (CH4), and ethane (C2H6), and the output features are the fault types given in Table I. These two tasks were done with the help of the pandas library [9].

The output features are categories represented in linguistic form, such as PD, TH, TL, D1, D2, and NF. These linguistic forms are converted into numerical labels using the Scikit-learn library [10]; handling the categorical variables this way allows the keras library [11] to be used to implement the artificial neural network. Splitting is very important in order to validate the performance of a machine learning model. The data is normalized and then split into training and testing sets using the Scikit-learn library [10]. The training data selected from the split dataset is fed to the classifier model, and the model was trained over the training data using the keras library.

A. Multilayer Artificial Neural Network

Artificial neural networks are computing systems made up of interconnections of nodes (neurons), inspired by the biological brain. Neural networks have shown efficiency in learning from examples without being explicitly programmed. The basic building block of the ANN is the perceptron.

A collection of neurons makes a layer, and layers are connected so that the first layer feeds the second, the second feeds the third, and so on until the last layer.

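The pre-processing pipeline described above (label encoding, normalization, and the train/test split) can be sketched with the standard library alone. The paper itself uses pandas and Scikit-learn for these steps, and its exact normalization scheme is not stated, so min-max scaling is assumed here; the tiny feature matrix is invented for illustration.

```python
import random

def encode_labels(labels):
    """Map linguistic fault labels (PD, D1, ...) to integer codes."""
    classes = sorted(set(labels))
    to_code = {c: i for i, c in enumerate(classes)}
    return [to_code[l] for l in labels], to_code

def min_max_normalize(rows):
    """Scale each gas-concentration column to the [0, 1] range."""
    cols = list(zip(*rows))
    scaled_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0          # avoid division by zero for flat columns
        scaled_cols.append([(v - lo) / span for v in col])
    return [list(r) for r in zip(*scaled_cols)]

def train_test_split(X, y, test_ratio=0.2, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_ratio))
    tr, te = idx[:cut], idx[cut:]
    return ([X[i] for i in tr], [X[i] for i in te],
            [y[i] for i in tr], [y[i] for i in te])

# Invented toy data: two gas columns, five samples.
X = [[100.0, 50.0], [200.0, 75.0], [150.0, 60.0], [120.0, 90.0], [180.0, 40.0]]
y = ["PD", "D1", "PD", "NF", "D1"]
y_enc, mapping = encode_labels(y)
X_norm = min_max_normalize(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_norm, y_enc)
```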
This connection has a non-linear activation function between each pair of layers. The first layer is called the input layer and the last one the output layer, whereas the in-between layers are called hidden layers. There may be more than one hidden layer, based on the requirement. Such a collection of layers, interconnected through activation functions, is called a Multilayer Neural Network.

TABLE III. ASSOCIATED ACTIVATION FUNCTIONS

Layer                   Activation function
Input layer             none
First hidden layer      softplus
Second hidden layer     relu
Output layer            softmax

The multilayer neural network has a wide range of applications and can be adapted to almost any sector; MLNNs are very popular in the medical, engineering, and banking sectors, among others. The basic building block of the MLNN is the perceptron [12][13][14][15].

Fig. 2. Multilayer Artificial Neural Network.

The Multilayer Neural Network classifier model was built using the keras API [11]. The model has one input layer, two hidden layers, and one output layer. The input layer has no activation function, the first hidden layer uses the softplus activation function, the second hidden layer uses relu (rectified linear unit), and the output layer uses softmax. The back-propagation technique is used for the weight updates, along with the stochastic gradient descent optimizer Adam [16].

TABLE IV. IMPORTANT COMPONENTS OF MLANN

Component       Value
Optimizer       Adam
Layers          4
Neurons         35
Epochs          1500
Batch size      40

The number of neurons, the number of layers, and the activation function associated with each layer were selected using the GridSearch tool of the Scikit-learn library [10]. All the possible values of these hyperparameters were supplied to the GridSearch tool, which returns the optimal values.

B. Support Vector Machine Classifier

The Support Vector Machine is based on support vectors and was originally developed for binary classification and regression. It was established in order to find the optimal hyperplane separating two classes with maximum margin: it tries the different available separating lines and chooses the one with the maximum margin.

Fig. 3. Support Vector Machine

The basic Support Vector Machine was used with linearly separable data, but it can also be applied to non-linear data by using a kernel function. Some data that are not linearly separable in a two-dimensional space become linearly separable when mapped into a higher-dimensional space; this transformation of the dataset from the lower-dimensional space to the higher-dimensional one is done with the help of a kernel function. Various kernel functions are available, such as the polynomial and RBF kernels, and the required kernel function is selected based on the dataset. Although the Support Vector Machine was developed for binary classification, it can be used for multiclass classification as well, by using one of two approaches: OAA (One Against All) or OAO (One Against One) [6]. The Support Vector Machine classifier here uses the RBF kernel [17], which stands for Radial Basis Function. The governing equation of the RBF kernel is:

K(x_i, x_j) = exp(−γ ||x_i − x_j||²)    (1)

The SVM classifier model was formed using the LIBSVM and Scikit-learn libraries [10][18] with the help of Python 3.7 in the Spyder IDE.
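The activation functions of Table III can be made concrete with a plain-Python forward pass through one small layer stack. This is a sketch, not the trained model: the weights below are invented toy values, and the real model's 35 neurons are tuned by grid search.

```python
import math

def softplus(v):
    """Smooth approximation of relu: ln(1 + e^x), element-wise."""
    return [math.log(1.0 + math.exp(x)) for x in v]

def relu(v):
    """Rectified linear unit: max(0, x), element-wise."""
    return [max(0.0, x) for x in v]

def softmax(v):
    """Convert raw scores into a probability distribution over classes."""
    m = max(v)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]

def dense(v, weights, bias):
    """One fully connected layer; `weights` has one row per output neuron."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

x = [0.2, 0.8, 0.1]                         # normalized gas features (invented)
h1 = softplus(dense(x, [[0.5, -0.2, 0.1], [0.3, 0.4, -0.5]], [0.0, 0.1]))
h2 = relu(dense(h1, [[0.7, -0.3], [-0.6, 0.9]], [0.05, -0.05]))
out = softmax(dense(h2, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]))
assert abs(sum(out) - 1.0) < 1e-9           # softmax output sums to 1
```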

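Equation (1) can be checked directly with a few standalone lines, using the gamma value of 0.5 that the grid search selects in Table V below:

```python
import math

def rbf_kernel(x_i, x_j, gamma=0.5):
    """RBF kernel of equation (1): K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)

# Identical points have similarity 1; distant points decay toward 0.
assert rbf_kernel([1.0, 2.0], [1.0, 2.0]) == 1.0
assert rbf_kernel([0.0, 0.0], [10.0, 10.0]) < 1e-6
```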
The important parameters of the Support Vector Machine are the kernel function, the penalty parameter C, and gamma. All the parameters were selected using the GridSearchCV tool of the Scikit-learn library [10], which trains the model with all the possible values of the given parameters, finds the accuracy for each set of parameters, and returns the optimal parameters based on those accuracies. The parameters used in the model are given in Table V:

TABLE V. SVM PARAMETERS

Serial number   Parameter        Value
1.              SVM              C-SVM
2.              C                100
3.              Kernel function  RBF
4.              gamma            0.5

IV. RESULT AND DISCUSSION

Both the Multilayer Artificial Neural Network model and the Support Vector Machine classifier model have done quite well. The results were depicted using plots [18] and are expressed in the terms discussed below.

A. Correct predicted percentage of faults

The correct predicted percentage of faults is defined as the percentage of correct predictions out of the total predictions for a particular type of fault.

B. Results of the MLANN-based model

The Multilayer Artificial Neural Network classifier produces an overall training accuracy (accuracy when evaluated over the training data) of about 77.8 percent and a testing accuracy (accuracy obtained over the testing data) of about 76 percent. The actual and predicted result comparison is given in Fig. 4.

Fig. 4. Actual versus predicted output.

Other than the overall accuracy, the MLANN gives a correct predicted percentage of 67% for Low Energy Discharge, i.e., of all the faults predicted as Low Energy Discharge by the MLANN, 67% were correctly predicted. Similarly, the correct predicted percentage is 80% for High Energy Discharge, 100% for No Fault, 60% for Partial Discharge, 75% for thermal faults of low and medium temperature, and 89% for thermal faults of high temperature.

Fig. 5. Correct predicted percentage by MLANN

C. Results of the SVM-based model

The Support Vector Machine classifier gives an overall training accuracy (accuracy when evaluated over the training data) of about 88.8% and a testing accuracy (accuracy obtained over the testing data) of about 81.4%. The comparison between predicted and actual output is given in Fig. 6.

Fig. 6. Actual versus predicted output.
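The per-fault "correct predicted percentage" used in these results is what is usually called per-class precision. A stdlib sketch with made-up labels (not the paper's data) illustrates the computation:

```python
def correct_predicted_percentage(actual, predicted, fault):
    """Of all samples predicted as `fault`, what share was truly `fault`?"""
    predicted_as_fault = [a for a, p in zip(actual, predicted) if p == fault]
    if not predicted_as_fault:
        return 0.0                       # the class was never predicted
    hits = sum(1 for a in predicted_as_fault if a == fault)
    return 100.0 * hits / len(predicted_as_fault)

# Toy labels: three samples predicted as D1, two of them actually D1.
actual    = ["D1", "D1", "PD", "D2", "D1"]
predicted = ["D1", "D1", "D1", "D2", "PD"]
assert abs(correct_predicted_percentage(actual, predicted, "D1") - 200.0 / 3.0) < 1e-9
```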

Other than the overall accuracy, the SVM classifier gives a correct predicted percentage of 80% for Low Energy Discharge, i.e., of all the faults predicted as Low Energy Discharge by the SVM classifier, 80% were correctly predicted. Similarly, the correct predicted percentage is 86% for High Energy Discharge, 100% for No Fault, 60% for Partial Discharge, 76% for thermal faults of low and medium temperature, and 90% for thermal faults of high temperature.

Fig. 7. Correct predicted percentage by SVM classifier.

V. CONCLUSION

Based on the results, we conclude that this study has yielded good results in fault prediction. Both models, the Multilayer Artificial Neural Network and the Support Vector Machine classifier, were able to classify the faults successfully, although the Support Vector Machine has shown better results than the Multilayer Artificial Neural Network. In this paper a simple split was used to divide the data into training and validation/test sets; in future work, cross-validation can be applied instead and may give better results. Other available machine learning models can also be used for the same purpose and their results compared. This work feeds preprocessed gas concentrations as direct input to the machine learning models; researchers can instead use gas ratios as input to the machine learning models in their future work.

REFERENCES

[1] M. Wang, A. J. Vandermaar, and K. D. Srivastava, "Review of condition assessment of power transformers in service," IEEE Electr. Insul. Mag., 2002, doi: 10.1109/MEI.2002.1161455.
[2] R. R. Rogers, "IEEE and IEC codes to interpret incipient faults in transformers, using gas in oil analysis," IEEE Trans. Electr. Insul., 1978, doi: 10.1109/TEI.1978.298141.
[3] H. Ma, T. K. Saha, C. Ekanayake, and D. Martin, "Smart transformer for smart grid - intelligent framework and techniques for power transformer asset management," IEEE Trans. Smart Grid, 2015, doi: 10.1109/TSG.2014.2384501.
[4] M. Duval and J. J. Dukarm, "Improving the reliability of transformer gas-in-oil diagnosis," IEEE Electr. Insul. Mag., 2005, doi: 10.1109/MEI.2005.1489986.
[5] M. Duval and A. DePablo, "Interpretation of gas-in-oil analysis using new IEC publication 60599 and IEC TC 10 databases," IEEE Electr. Insul. Mag., 2001, doi: 10.1109/57.917529.
[6] R. Hierons, "Machine learning. Tom M. Mitchell. Published by McGraw-Hill, Maidenhead, U.K., International Student Edition, 1997. ISBN: 0-07-115467-1, 414 pages," Softw. Testing, Verif. Reliab., 1999, doi: 10.1002/(sici)1099-1689(199909)9:3<191::aid-stvr184>3.0.co;2-e.
[7] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[8] S. B. Kotsiantis and D. Kanellopoulos, "Data preprocessing for supervised leaning," Int. J. …, 2006, doi: 10.1080/02331931003692557.
[9] W. McKinney, "Data structures for statistical computing in Python," Proc. 9th Python Sci. Conf., 2010.
[10] F. Pedregosa et al., "Scikit-learn: Machine learning in Python," J. Mach. Learn. Res., 2011.
[11] F. Chollet, "Keras documentation," Keras.io, 2015.
[12] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[13] Z. Wang, "Artificial intelligence applications in the diagnosis of power transformer incipient faults," Virginia Tech, 2000.
[14] M. A. A. Siddique and S. Mehfuz, "Artificial neural networks based incipient fault diagnosis for power transformers," in 12th IEEE International Conference Electronics, Energy, Environment, Communication, Computer, Control (E3-C3), INDICON 2015, 2016, doi: 10.1109/INDICON.2015.7443174.
[15] P. Naraei, A. Abhari, and A. Sadeghian, "Application of multilayer perceptron neural networks and support vector machines in classification of healthcare data," in FTC 2016 - Proceedings of Future Technologies Conference, 2017, doi: 10.1109/FTC.2016.7821702.
[16] D. P. Kingma and J. L. Ba, "Adam: A method for stochastic optimization," in 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 2015.
[17] Y. D. Chavhan, B. S. Yelure, and K. N. Tayade, "Speech emotion recognition using RBF kernel of LIBSVM," in 2nd International Conference on Electronics and Communication Systems, ICECS 2015, 2015, doi: 10.1109/ECS.2015.7124760.
[18] C. C. Chang and C. J. Lin, "LIBSVM: A library for support vector machines," ACM Trans. Intell. Syst. Technol., 2011, doi: 10.1145/1961189.1961199.
[19] S. Paul, J. K. Verma, A. Datta, R. N. Shaw, and A. Saikia, "Deep learning and its importance for early signature of neuronal disorders," in 2018 4th International Conference on Computing Communication and Automation, ICCCA 2018, 2018, doi: 10.1109/CCAA.2018.8777527.
[20] Milan Kumar, V. M. Shenbagaraman, and Ankush Ghosh, "Innovations in Electrical and Electronic Engineering," book chapter in Favorskaya et al. (Eds.), Innovations in Electrical..., Springer. ISBN 978-981-15-4691-4.

