DOI: 10.1145/3302425.3302447
research-article

Imputation Method of Missing Values for Dissolved Gas Analysis Data Based on Iterative KNN and XGBoost

Published: 21 December 2018

Abstract

Power transformers are an important part of the power system. Accurate monitoring of their operating status is particularly important for the normal and stable operation of the entire power system and for the timely diagnosis of potential faults. Dissolved Gas Analysis (DGA) can detect and diagnose failures of oil-immersed power transformers by comparing the dissolved gas content of the transformer oil in the normal operating state with that in the fault state. However, during the operation of grid transformers, the detection data often contains missing values. This paper proposes an effective imputation method for missing values based on iterative KNN and XGBoost. First, an XGBoost tree ensemble is trained on the dataset that contains missing values. Information obtained from training, such as the number of times each attribute is used for splitting, is used to calculate importance scores for the attributes and thereby determine their priority. The missing values are then imputed in an iterative manner. Experimental results on DGA datasets at different missing rates show that the proposed method is more accurate than existing similar methods, and that the imputed dataset significantly improves the classification performance of the classifier.
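The abstract gives only an outline of the procedure, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that per-attribute importance scores are already available (for example, split counts such as those returned by XGBoost's `Booster.get_score(importance_type="weight")`), that attributes are imputed from most to least important, that missing entries are initialised with column means, and that every attribute has at least one observed value. The function name `impute_by_priority` and the parameters `k` and `n_iter` are hypothetical.

```python
import math


def impute_by_priority(rows, importance, k=3, n_iter=3):
    """Iteratively fill None entries with the mean of the k nearest donors.

    rows:       list of equal-length lists; None marks a missing value.
    importance: one score per column (assumed to come from a trained
                tree ensemble); higher scores are imputed first.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    missing = [[value is None for value in row] for row in rows]
    filled = [list(row) for row in rows]  # work on a copy

    # Assumption: start from column means so distances are defined everywhere.
    for j in range(n_cols):
        observed = [rows[i][j] for i in range(n_rows) if not missing[i][j]]
        mean = sum(observed) / len(observed)
        for i in range(n_rows):
            if missing[i][j]:
                filled[i][j] = mean

    # Assumption: the most important attribute has the highest priority.
    order = sorted(range(n_cols), key=lambda j: importance[j], reverse=True)

    for _ in range(n_iter):
        for j in order:
            # Donors are rows where attribute j was actually observed.
            donors = [i for i in range(n_rows) if not missing[i][j]]
            for i in range(n_rows):
                if not missing[i][j]:
                    continue

                def distance(d):
                    # Euclidean distance over all attributes except j,
                    # using the current (partly imputed) values.
                    return math.sqrt(sum(
                        (filled[i][c] - filled[d][c]) ** 2
                        for c in range(n_cols) if c != j
                    ))

                nearest = sorted(donors, key=distance)[:k]
                filled[i][j] = sum(filled[d][j] for d in nearest) / len(nearest)

    return filled
```

Calling `impute_by_priority(rows, importance=[0.7, 0.3], k=1)` on a small table with `None` entries returns a new table and leaves `rows` unchanged. Because each pass reuses values imputed in earlier passes, the estimates can be refined over iterations; in practice the attributes would typically be scaled before distances are computed.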


Cited By

  • (2024) PEDI-GAN: power equipment data imputation based on generative adversarial networks with auxiliary encoder. The Journal of Supercomputing. DOI: 10.1007/s11227-024-05891-7. Online publication date: 3-Feb-2024
  • (2022) A Proposed Framework for Estimating Missing Values in Biofuel Feedstock Selection. 2022 IEEE 7th International conference for Convergence in Technology (I2CT), pp. 1-8. DOI: 10.1109/I2CT54291.2022.9824642. Online publication date: 7-Apr-2022

Information & Contributors

Information

Published In

ACM Other conferences
ACAI '18: Proceedings of the 2018 International Conference on Algorithms, Computing and Artificial Intelligence
December 2018
460 pages
ISBN: 9781450366250
DOI: 10.1145/3302425
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • The Hong Kong Polytechnic: The Hong Kong Polytechnic University
  • City University of Hong Kong: City University of Hong Kong

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Dissolved Gas Analysis
  2. Interpolation Priority
  3. Iterative KNN
  4. Missing Values

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • State Grid Liaoning Electric Power Supply CO., LTD

Conference

ACAI 2018

Acceptance Rates

ACAI '18 Paper Acceptance Rate 76 of 192 submissions, 40%;
Overall Acceptance Rate 173 of 395 submissions, 44%
