Data-Driven Dynamics Learning on Time Simulation of SF6 HVDC-GIS Conical Solid Insulators

Urazaki Junior, Kenji; Lucchini, Francesco; Marconato, Nicolò

doi:10.3390/electronics14030616



by Kenji Urazaki Junior ¹, Francesco Lucchini ¹,² and Nicolò Marconato ¹,²,*

¹ Department of Industrial Engineering, University of Padova, Via Gradenigo 6/A, 35131 Padova, Italy

² Consorzio RFX (CNR, ENEA, INFN, Università di Padova, Acciaierie Venete SpA), Corso Stati Uniti 4, 35127 Padova, Italy

* Author to whom correspondence should be addressed.

Electronics 2025, 14(3), 616; https://doi.org/10.3390/electronics14030616

Submission received: 8 January 2025 / Revised: 28 January 2025 / Accepted: 2 February 2025 / Published: 5 February 2025

(This article belongs to the Special Issue Applications of Machine Learning and Artificial Intelligence in Modern Power and Energy Systems)


Abstract: An HVDC-GIL system with a conical spacer in a radioactive environment is studied in this work using simulated data on COMSOL® Multiphysics. Electromagnetic simulations on a 2D model were performed with varying ion-pair generation rates and potential applied to the system. This article explores machine learning methods to derive time to steady state, dark current, gas conductivity, and surface charge density expressions. The focus was on constructing symbolic representations, which could be interpretable and less prone to overfitting, using the symbolic regression (SR) and sparse identification of nonlinear dynamics (SINDy) algorithms. The study successfully derived the intended expressions, demonstrating the power of symbolic regression. Predictions of dark currents in the gas–ground electrode interface reported an absolute error and mean absolute percentage error (MAPE) of $1.04 \times 10^{-4}$ pA and 0.01%, respectively. The solid–ground electrode interface reported an error of $8.99 \times 10^{-5}$ pA and MAPE of 0.04%, showing strong agreement with simulation data. Expressions for time to steady state had a test error of approximately 110 h with MAPE of around 3%. Steady-state gas conductivity expression achieved an absolute error of 0.55 log(S/m) and MAPE of 1%. An interpretable equation was created with SINDy to model the time evolution of surface charge density, achieving a root mean squared error of 1.12 nC/m²/s across time-series data. These results demonstrate the capability of SR and SINDy to provide interpretable and computationally efficient alternatives to time-consuming numerical simulations of HVDC systems under radiation conditions. While the model provides useful insights, performance and practical applications of the expressions can improve with more diverse datasets, which might include experimental data in the future.

Keywords: HVDC; gas insulated transmission lines; solid-gas surface charging; SF₆; physics-informed machine learning; SINDy; SR

1. Introduction

High-voltage direct current (HVDC) gas-insulated lines (GILs) enable the transmission of power over long distances with high system stability and offer ecological and economic benefits. They are increasingly attractive because the energy transition requires transporting energy, in some cases, over hundreds of kilometers [1,2].

Positive and negative ion pairs are responsible for the gas conductivity. Under normal circumstances, natural background radiation generates these ion pairs. SF₆ gas insulation systems face a significant problem when operating in the presence of radiation, such as near a tokamak, which emits neutrons, X-rays, and gamma rays. Radiation increases the ion-pair generation in the gas through ionization and dissociation, consequently increasing the conductivity of the gas [3,4,5,6].

The increase in conductivity affects the system’s electric field distribution and the insulators’ surface charge accumulation, resulting in a modification of the time to a steady state and the breakdown voltage, for example. It gives rise to dark currents, i.e., leakage current through the insulating system flowing out of the ground electrode, which can cause power losses, heating, and breakdowns [6,7]. Therefore, the design and operation of HVDC-GIS systems must consider radiation-induced effects. In modeling and simulation, a source parameter denoted as S accounts for the ionization process, representing the volumetric rate of ion-pair generation [3].

Many scenarios of radiation fields are possible. For example, in the ITER Tokamak, there are two orders of magnitude uncertainties in the radiation to which the gas insulator will be subjected, from 0.02 Gy/s to 2 Gy/s [4]. A. De Lorenzi et al. (2009) [8] suggested that more detailed modeling of the RIC (Radiation Induced Conductivity) phenomenon is needed.

Besides the ionization of the gas, the main issue that affects HVDC-GIS systems is the accumulation of electric charges on the interface between the gas and solid insulators, which may lead to a decrease in the flashover voltage and to breakdowns [1]. The dynamics of the surface charge accumulation are a critical factor in the design of the systems, and the enhanced SF₆ conductivity caused by radiation modifies these dynamics and increases the hetero-charge distribution on the insulator surface [8].

The investigations in this work use data-driven methods to learn the dynamics and relations in HVDC-GIS using SF₆ as gas insulation in different radiation conditions. The data were collected through simulations in COMSOL Multiphysics® using the electric currents and transport of diluted species physics, as described in [9]. Migration of charges in the solid insulator and in the gas models the surface charge accumulation. For simplicity, surface conductivity was not implemented in the models, but it can be easily added.

The systems studied were at 500 kV DC, the operating voltage of the Neutral Beam Injector of the Divertor Tokamak Test (DTT) facility [3,10]. The parameters that varied in the simulations were the ion-pair generation rate and the potential applied.

The ML (machine learning) models were trained and validated with simulation data. The SR [11] and SINDy [12,13,14,15] algorithms were evaluated for learning the dynamics of surface charge and predicting time to steady state, conductivity, and dark current at steady state. The SR algorithm aimed to find the best functional form to predict the target variable, while the SINDy one sought to find the best sparse regression of the dynamics of the target variable. The present work employed both methods and evaluated the results.

Previous work on obtaining an analytical solution for the time to steady state of the system demonstrated that the computed expression fails when the conductivity of the gas is affected by radiation, i.e., when the ion-pair generation parameter S is high. The difference between the analysis time and the time computed by the analytical expression was about one order of magnitude [8].

Building an analytical expression for the steady-state conditions aims to provide important information in advance without performing simulations. Further work can use dynamic models built with SINDy, for example to optimize the design of HVDC structures in RIC conditions. The optimization of electromagnetic problems is a very demanding task, as the computational cost is high and the function to be optimized is unknown [8]. Simulation and optimization using surrogate models built with ML, such as the surface charge density dynamics, can reduce the time needed to obtain a feasible solution.

Data-driven strategies for modeling physical and engineering systems have often proven successful. Machine learning and artificial intelligence can tackle the complex, large datasets that normally appear in engineering and physical problems, as is the case for the system evaluated in this work. The data-driven framework can help discover missing physics [16], improve control [17], find reduced models [18,19], and model discrepancies between the real systems and simulation results that use first-principle laws [16].

A practical example of how analytical expressions might accelerate simulations in HVDC RIC systems was presented in the paper [9]. The authors demonstrated the feasibility of simplifying the drift diffusion reaction (DDR) equations of ions in the gas using a conductivity analytical expression. The analytical expression is a compromise between the most detailed simulation using DDR equations and the simplest one using a constant gas conductivity, which does not work when RIC conditions occur in the system.

Although many works on HVDC using machine learning exist, they mainly focus on monitoring [20], control [21,22], partial discharges [23,24,25], and fault detection and prediction [26,27,28,29] in the systems. One previous work in the literature involves directly predicting the flashover voltage of an AC gas-insulated system, but it uses experimental data for learning [30]. New forms of applying machine learning to the HVDC-GIL system, such as the present work, might bring new paths for optimizations using surrogate models, fast assessments with analytical expressions, and learning and understanding more of the systems with the derived interpretable representations, especially in high-radiation environments.

The main contributions of this work are summarized as follows:

Symbolic expressions for key variables: Developed interpretable symbolic models for dark currents, gas conductivity, and time to steady state using symbolic regression.
Symbolic dynamic expression: Applied SINDy to derive a sparse surface charge density model, resulting in an interpretable expression that models a complicated phenomenon evolution in time.
Method’s applicability: Demonstrated how machine learning-derived models can be obtained from simulated data, showcasing their capability in high-complexity HVDC systems. Extensions for experimental data are trivial.

2. Electromagnetic Modeling

This section recalls the approach for electromagnetic (EM) modeling of HVDC gas-insulated equipment, such as HVDC-GIS and HVDC-GIL. The capacitive-to-resistive transition of the electric field following the application of DC voltage is described by the electro-quasistatic (EQS) approximation of Maxwell's equations. In particular, during the time evolution, the surface charge density $\varrho_{s}$ accumulates along the interface between the dielectric gas and the solid insulator according to the law [31]:

$$\frac{d \varrho_{s}}{d t} = \hat{n} \cdot (J_{i} - J_{g}) - \nabla \cdot (\sigma_{s} E_{t}), \tag{1}$$

where $\hat{n}$ is the normal vector pointing out from the solid insulator, $J_{i}$ and $J_{g}$ are the current densities within the insulator and the gas, $\sigma_{s}$ is the interface electric conductivity, and $E_{t}$ is the electric field tangent to the interface. It is well recognized that the accumulation of $\varrho_{s}$ is related to the main fault phenomena within HVDC gas-insulated systems [32].

A complete description of the electric field in the dielectric gas ($E_{G}$) requires the solution of a self-consistent model, described by the following set of partial differential equations (PDEs) [1,33]:

$$\frac{\partial n^{+}}{\partial t} = S - R n^{+} n^{-} - \nabla \cdot (n^{+} \mu^{+} E_{G}) + D^{+} \Delta n^{+} \tag{2}$$

$$\frac{\partial n^{-}}{\partial t} = S - R n^{+} n^{-} + \nabla \cdot (n^{-} \mu^{-} E_{G}) + D^{-} \Delta n^{-} \tag{3}$$

$$\nabla \cdot E_{G} = \frac{e (n^{+} - n^{-})}{\varepsilon_{G}} . \tag{4}$$

Equations (2) and (3) describe the generation, drift, diffusion, and recombination of positive ($n^{+}$) and negative ($n^{-}$) ion number densities through a set of transport parameters: the mobility ($\mu$), the diffusion coefficient ($D$), and the recombination ratio $R$. The source of ion pairs is driven by the parameter $S$; $e$ is the elementary charge and $\varepsilon_{G}$ is the permittivity of the gas. A detailed description of the boundary and initial conditions for the solution of the model, as well as the coupling with the model for the conduction in the solid insulator, is out of the scope of the paper; however, the interested reader can find details, e.g., in [34,35].

Combining Equations (2) and (3), the expression of the current density in the gas $J_{G}$ is inferred:

$$J_{G} = e \left[ E_{G} (\mu^{+} n^{+} + \mu^{-} n^{-}) - \nabla (D^{+} n^{+} - D^{-} n^{-}) \right] . \tag{5}$$
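As a rough illustration of how the source term $S$ drives the gas conductivity, the sketch below integrates Equation (2) in the spatially uniform, field-free limit, where drift and diffusion vanish and $n^{+} = n^{-} = n$, so that $dn/dt = S - R n^{2}$. It then evaluates the drift (ohmic) part of Equation (5). The recombination coefficient and the mobility are placeholder values chosen only for illustration; they are not the parameters used in the simulations, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in C


def bulk_ion_density(S, R, dt=1.0, steps=5000):
    """Explicit-Euler integration of dn/dt = S - R*n**2, starting from n = 0."""
    n = 0.0
    for _ in range(steps):
        n += dt * (S - R * n * n)
    return n


def drift_conductivity(n_pos, n_neg, mu_pos, mu_neg):
    """Drift part of Equation (5): sigma = e * (mu+ n+ + mu- n-)."""
    return E_CHARGE * (mu_pos * n_pos + mu_neg * n_neg)


# Assumed, illustrative values (not taken from the paper's parameter table).
S = 8e7      # ion pairs / m^3 / s
R = 1e-13    # recombination coefficient, m^3 / s
MU = 5e-6    # ion mobility, m^2 / (V s), same value for both species

n_numeric = bulk_ion_density(S, R)
n_analytic = math.sqrt(S / R)  # fixed point of S - R*n**2 = 0
sigma = drift_conductivity(n_numeric, n_numeric, MU, MU)
```

In this simplified limit the ion density, and therefore the conductivity, scales with $\sqrt{S}$, which gives an intuition for why a higher ion-pair generation rate raises the gas conductivity.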

3. Machine Learning Methods

This section briefly describes the machine learning methods. Sparse Identification of Nonlinear Dynamics and symbolic regression search for a symbolic representation to fit the target variable. This learning process uses data that might be generated by simulations or collected in experiments and operations. This work uses only simulated data from a physical model. The methods permit the search for parsimonious models, either by promoting the sparseness of the fitted coefficients or evaluating the expression’s complexity. The learning tasks are supervised, i.e., the target is known and provided to the algorithm. The model space comprises analytic expressions that use predictive variables collected or generated simultaneously with the target.

3.1. SINDy

The SINDy (Sparse Identification of Nonlinear Dynamics) method used in this work was proposed in [13] to find the governing equations of nonlinear dynamical systems from measurement data. The method uses machine learning and promotes the sparsity of the parameters, determining the fewest terms sufficient to describe the observed data accurately. It can be classified as a symbolic regression method that focuses on fitting polynomial expressions to derivative data, as presented in Equation (6). It assumes that only a few terms are essential for the regression, so that the resulting low-order models tend to generalize and avoid overfitting [14,36].

Later, robust dynamics learning with rational polynomial candidates was proposed and developed in [36], increasing the robustness of the algorithm to noise, and constrained SINDy was proposed and developed in [14], which allows the coefficients to be constrained given prior knowledge, for example, from physics. The SINDy method was successfully applied to fluid dynamics combined with model order reduction via proper orthogonal decomposition (POD) [14], complex mechanical problems [13,36], and reaction kinetics [36]. The identified models are parsimonious, balance accuracy and complexity, and remain interpretable as a symbolic representation [14].

$$\frac{d}{d t} x(t) = f(x(t)) . \tag{6}$$

The sparsity-promoting characteristics of the method make it a good algorithm for finding simplified or reduced models of apparently complicated/complex systems [13,14]. Equation (7) presents the general regression of the nonlinear dynamical system, and an example of a sparse, regularized regression problem to find $\Xi$, using LASSO, is presented in Equation (8) [13].

$$\dot{X} = \Theta(X) \, \Xi, \tag{7}$$

where $\Theta \in \mathbb{R}^{m \times p}$ represents the feature library, which is the set of candidate functions to describe the evolution in time of the system; $X \in \mathbb{R}^{m \times n}$ is the matrix whose rows are the state vectors $x(t) \in \mathbb{R}^{n}$ at each time $t = [t_{1}, t_{2}, \dots, t_{m}]$; and $\Xi = [\xi_{1} \; \xi_{2} \; \dots \; \xi_{n}]$ collects the coefficient vectors $\xi \in \mathbb{R}^{p}$.

$$\xi = \arg\min_{\xi'} \left( \| \Theta \xi' - y \|_{2} + \lambda \| \xi' \|_{1} \right), \tag{8}$$

where $\lambda$ weights the sparsity constraint of the optimization. Each column of $\dot{X}$ is learned separately to find a sparse vector of coefficients for each variable of interest.

The state vector $x(t) \in \mathbb{R}^{n}$ can be collected during the simulation or from measurement data, sampled at different times $t_{1}, t_{2}, \dots, t_{m}$. The derivative of the dynamic state of interest comes from the collected time-series data, and the values of the state of the system at each time $t$ build the feature library [13]. An example of a feature library is presented in matrix (9), where each column in $\Theta(X)$ represents a candidate function for the right-hand side of Equation (6).

$$\Theta(X) = \begin{bmatrix} | & | & | & | & & | & | & | & | & \\ 1 & X & X^{P_{2}} & X^{P_{3}} & \dots & \sin(X) & \cos(X) & \sin(2X) & \cos(2X) & \dots \\ | & | & | & | & & | & | & | & | & \end{bmatrix}, \tag{9}$$

where $X^{P_{\alpha}}$ represents all candidate polynomial functions of order $\alpha$. For example, matrix (10) presents the library when $\alpha = 2$.

$$X^{P_{2}} = \begin{bmatrix} x_{1}^{2}(t_{1}) & x_{1}(t_{1}) x_{2}(t_{1}) & \dots & x_{2}^{2}(t_{1}) & x_{2}(t_{1}) x_{3}(t_{1}) & \dots & x_{n}^{2}(t_{1}) \\ x_{1}^{2}(t_{2}) & x_{1}(t_{2}) x_{2}(t_{2}) & \dots & x_{2}^{2}(t_{2}) & x_{2}(t_{2}) x_{3}(t_{2}) & \dots & x_{n}^{2}(t_{2}) \\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ x_{1}^{2}(t_{m}) & x_{1}(t_{m}) x_{2}(t_{m}) & \dots & x_{2}^{2}(t_{m}) & x_{2}(t_{m}) x_{3}(t_{m}) & \dots & x_{n}^{2}(t_{m}) \end{bmatrix} . \tag{10}$$
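The construction of the polynomial part of the library can be sketched as follows. This is a minimal standard-library illustration, not the PySINDy implementation; the function name `polynomial_library` and the naming of the columns are assumptions made for the example. Each row of the input holds the state at one sampling time, and each output column is one candidate monomial, ordered as in matrix (10).

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_library(X, degree, include_bias=True):
    """Return (theta, names): all monomials of the state up to `degree`.

    X is a list of m rows, each row holding the n state values at one time.
    """
    n = len(X[0])
    terms = [()] if include_bias else []  # the empty term is the constant 1
    for d in range(1, degree + 1):
        terms.extend(combinations_with_replacement(range(n), d))
    theta = [[prod(row[i] for i in term) for term in terms] for row in X]
    names = ["1" if not term else "*".join(f"x{i + 1}" for i in term)
             for term in terms]
    return theta, names


X = [[2.0, 3.0], [1.0, -1.0]]  # two time samples of a two-state system
theta, names = polynomial_library(X, degree=2)
```

For a two-state system and degree 2, the columns are `1`, `x1`, `x2`, `x1*x1`, `x1*x2`, and `x2*x2`, so the library width grows combinatorially with the number of states and the degree.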

The feature library can contain nonlinear functions, which are built and chosen before the optimization. No assumptions or priors are needed to obtain a sparse system representation. During the fitting of $\Xi$, the optimization algorithm promotes the system's sparseness. The algorithm for learning the coefficients is sequential thresholded least squares (STLS) [13]. If the feature library consists only of linear terms and the optimization is not penalized, the SINDy algorithm reduces to dynamic mode decomposition [14].

The sequential thresholded least-squares (STLS) algorithm begins with a least-squares solution for $\Xi$, after which all coefficients smaller than a cutoff value $\lambda$ are set to zero. The least-squares fit is then repeated on the remaining terms, and this thresholding process continues iteratively until the non-zero coefficients converge. The method is computationally efficient, rapidly achieving a sparse solution in only a few iterations. Furthermore, it is remarkably robust to noise, performing well even when the derivatives come from noisy data [13,14]. The best $\lambda$ and the final model should be chosen by varying the parameter and searching along the Pareto front [14].
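The iteration can be sketched in a few lines. The version below is a minimal, standard-library illustration rather than the PySINDy optimizer: it solves each least-squares subproblem through the normal equations, which is adequate for this tiny example but not what a production library would do, and all function names are hypothetical. As a check, it recovers $\dot{x} = -2x$ from samples of $x(t) = e^{-2t}$ using the library $[1, x, x^{2}]$, with the derivative estimated by central differences.

```python
import math


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def least_squares(theta, y, cols):
    """Least-squares fit of y on the selected library columns (normal equations)."""
    A = [[sum(row[i] * row[j] for row in theta) for j in cols] for i in cols]
    b = [sum(row[i] * yi for row, yi in zip(theta, y)) for i in cols]
    return solve(A, b)


def stls(theta, y, threshold, max_iter=10):
    """Sequential thresholded least squares for one column of X-dot."""
    p = len(theta[0])
    active = list(range(p))
    xi = [0.0] * p
    for _ in range(max_iter):
        coef = least_squares(theta, y, active)
        xi = [0.0] * p
        for i, c in zip(active, coef):
            xi[i] = c
        kept = [i for i in active if abs(xi[i]) >= threshold]
        if kept == active or not kept:
            break  # converged, or every term was removed
        active = kept
    return [c if abs(c) >= threshold else 0.0 for c in xi]


dt = 0.01
x = [math.exp(-2.0 * k * dt) for k in range(201)]
# Central differences estimate the derivative at the interior samples.
x_dot = [(x[k + 1] - x[k - 1]) / (2 * dt) for k in range(1, 200)]
theta = [[1.0, x[k], x[k] ** 2] for k in range(1, 200)]
xi = stls(theta, x_dot, threshold=0.1)
```

Only the coefficient of the linear term survives the thresholding; its small deviation from $-2$ comes from the finite-difference estimate of the derivative.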

As evident from the algorithm, the cutoff threshold $\lambda$ controls the number of predictive variables in the model, bringing a tradeoff between complexity and accuracy in the learned model. Generally, the higher the $\lambda$, the lower the complexity and the accuracy; in this case, the model might underfit the data. Choosing simpler models with a sparse matrix of coefficients and good accuracy is advisable. Physical systems generally have only a few relevant terms in the governing equations [13]. By promoting sparseness in the solution, the method bypasses the intractable brute-force search through all possible model structures [14].

This work uses the PySINDy library, which implements different SINDy methods readily available in Python [37,38].

3.2. Symbolic Regression

Discovering equations from observed data is one of the pillars of science. Symbolic regression, a machine learning algorithm that discovers generalizable and interpretable symbolic models, can automate this process.

The software used in this article is the package described in [11], the PySR library, which implements an efficient and distributed symbolic regression. PySR is open source and accepts custom loss functions and operators in the optimization process. It is aimed at “accelerating the discovery of interpretable symbolic models from data” [11].

In the implementation of [11], a genetic algorithm searches for candidate equations with a multi-population evolution in a loop of evolve–simplify–optimize. The real-valued coefficients that appear in the expressions are optimized using classical methods, while the genetic algorithm optimizes which operations are present, represented by integers. This split makes the search more efficient, as discovering equations from data is complicated and time-consuming [11]. Figure 1 presents a tree-based representation of the expression $1.15 y + 0.86$, where ‘+’ and ‘*’ nodes are the binary operators of addition and multiplication, respectively. Each operation and operand of an analytical expression is represented by a “node” in the tree. The default complexity in PySR, used for the fittings in this article, is the total number of nodes in the expression, independently of their content [11].
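To make the node-count complexity concrete, the sketch below builds the tree for $1.15 y + 0.86$ with a small hypothetical `Node` class (not PySR's internal data structure) and counts its nodes. Only the two operators that appear in this expression are supported.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """One node of an expression tree: an operator, a constant, or a variable."""
    value: Union[str, float]
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def complexity(self) -> int:
        """Total number of nodes in the subtree rooted at this node."""
        children = (self.left, self.right)
        return 1 + sum(c.complexity() for c in children if c is not None)

    def evaluate(self, variables: dict) -> float:
        if self.value == "+":
            return self.left.evaluate(variables) + self.right.evaluate(variables)
        if self.value == "*":
            return self.left.evaluate(variables) * self.right.evaluate(variables)
        if isinstance(self.value, str):
            return variables[self.value]  # a named variable such as "y"
        return self.value                 # a numeric constant


# 1.15 * y + 0.86
tree = Node("+", Node("*", Node(1.15), Node("y")), Node(0.86))
```

In this sketch the expression has five nodes: the two operators, the two constants, and the variable `y`.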

The generalization and interpretability of the models are promoted by jointly minimizing the prediction error and the model complexity. A penalization term on the loss function related to the frecency of the complexity of the expression promotes parsimony in the learning. Frecency measures the frequency and recency of a given complexity in the population [11].

The selection of the best model in the fitting considers a score value, as presented in Equation (11). This score relates the reduction in loss of the discovered equation to its increase in complexity, both relative to the preceding candidate.

$$\text{Score} = - \frac{\log \left( \frac{\text{loss}_{i}}{\text{loss}_{i - 1}} \right)}{\text{complexity}_{i} - \text{complexity}_{i - 1}} . \tag{11}$$
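A short numerical sketch of Equation (11) follows. The (complexity, loss) pairs are hypothetical and are not results from this study. The natural logarithm is assumed, and the first candidate, which has no predecessor, is given a score of 0.0 by convention in this example.

```python
import math


def scores_from_front(losses, complexities):
    """Score of Equation (11) for each candidate relative to the preceding one.

    Candidates must be sorted by strictly increasing complexity.
    """
    scores = [0.0]  # no predecessor for the first candidate (assumed convention)
    for i in range(1, len(losses)):
        d_complexity = complexities[i] - complexities[i - 1]
        scores.append(-math.log(losses[i] / losses[i - 1]) / d_complexity)
    return scores


# Hypothetical candidates, for illustration only.
complexities = [1, 3, 5, 9]
losses = [1.0, 0.5, 0.05, 0.045]
scores = scores_from_front(losses, complexities)
best = max(range(len(scores)), key=scores.__getitem__)
```

Here the third candidate obtains the highest score because it reduces the loss tenfold for only two additional nodes, whereas the fourth candidate adds four nodes for a marginal gain.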

The optimization process of PySR is the classical genetic algorithm based on the tournament for individual selection and mutations and crossovers to generate new ones. The software has some adaptations of the algorithms, mainly due to the nature of the model being optimized [11].

A simple evolutionary algorithm starts with a population of individuals, a fitness function, and a set of mutation operators. The algorithm evaluates the fitness of each individual in a randomly chosen subset of the population (of size $n_{s}$, where $n_{s} = 2$ for standard tournament selection), and the fittest one is selected with probability $p$. If it is not selected, it is removed from the subset and the process repeats until only one individual remains. A copy of the winning individual undergoes a randomly chosen mutation. Finally, the mutated individual replaces the weakest member of the population. At this last step, PySR instead replaces the eldest member, which is known as “age-regularized” evolution. A simulated annealing scheme also governs the selection of individuals: a high temperature increases diversity, and a low temperature narrows it [11].
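The selection and replacement steps can be sketched as below. This is a simplified illustration with hypothetical function names; the individuals are plain numbers, the mutation is a stand-in Gaussian perturbation, and simulated annealing is omitted. A lower loss is treated as fitter.

```python
import random


def tournament_select(population, loss, n_s=2, p=0.9, rng=random):
    """Sample n_s members, then walk them from fittest to least fit, accepting
    each with probability p; the last one left is accepted unconditionally."""
    subset = sorted(rng.sample(population, n_s), key=loss)
    for candidate in subset[:-1]:
        if rng.random() < p:
            return candidate
    return subset[-1]


def age_regularized_replace(population, birth_step, child, step):
    """Replace the eldest member (smallest birth step) with the new child."""
    eldest = min(range(len(population)), key=birth_step.__getitem__)
    population[eldest] = child
    birth_step[eldest] = step


rng = random.Random(0)
population = [0.0, 1.0, 2.5, 4.0, 7.0]
birth_step = [3, 0, 4, 1, 2]          # step at which each member was created


def loss(x):
    return abs(x - 3.0)               # toy fitness: distance to a target value


# With p = 1 and the whole population in the tournament, the fittest always wins.
winner = tournament_select(population, loss, n_s=len(population), p=1.0, rng=rng)
child = winner + rng.gauss(0.0, 0.1)  # stand-in for a random mutation
age_regularized_replace(population, birth_step, child, step=5)
```

Replacing the eldest member rather than the weakest one keeps good but old individuals from dominating the population indefinitely.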

The second modification of PySR with respect to the classical evolutionary algorithm is its adaptation to mathematical expressions, introducing the loop of evolve–simplify–optimize. The evolve step is the evolutionary selection previously described. The simplify step reduces equations to an equivalent, simpler form using algebraic transformations. Finally, the optimize step uses a classical optimization algorithm to fine-tune the constants within the equations.

The third and last modification is the adaptive parsimony metric described previously, i.e., using frecency instead of the complexity of the expression in the loss function. This makes the penalty adaptive and results in the number of expressions at different complexities being roughly the same.

4. Methodology

This section describes the steps performed in this work and details some important aspects of the methodology. The work can be split into three macro steps: data generation, transformation, and modeling. The flowchart in Figure 2 presents the steps and subtasks performed.

4.1. Data Generation

The data generation was performed in the commercial finite element method (FEM) software COMSOL® Multiphysics 6.2, which can be used to simulate many engineering problems, from heat transfer to electromagnetics, by modeling the physical phenomena. The physics used was the electric currents (ec) interface in the whole domain, with suitable boundary conditions set at the electrodes. The drift diffusion reaction model, which implements Equations (2) and (3) in the gas domain, was set up using the Transport of Diluted Species (tds) interface. The ec and tds interfaces are coupled to solve for the electric field and the charge concentration. The current density in the gas, which contributes to the surface charge accumulation and is described by Equation (5), was implemented as a variable in the physics, using the solutions of ec and tds. Further details of the implementation can be found in [9]. All simulations are time-dependent studies, i.e., they solve for the system's evolution in time. The Java code of the time-dependent simulations can be accessed in the repository [39].

The model described in Section 2, whose geometry is depicted in Figure 3 and was inspired by [40], was run 120 times in COMSOL®, varying the ion-pair generation rate and the potential applied. The results from the time-dependent simulations compose the final datasets in the learning tasks, from which the values of the target and predictive variables were extracted. The total simulated time was 8000 h for each pair of varied parameters, with 1000 uniformly distributed time sampling points.

Additional electrostatic simulations, which solve the Poisson PDE for potential without the DDR formulation and surface charging, were performed with varying potential applied in the same geometry of Figure 3 and the same material properties used in the previous 120 simulation runs described. The results from the electrostatic simulations were only used as predictive variables in the learning tasks, not being used as targets.

4.2. Data Transformation

This work involved learning tasks focused on different targets to predict. Three were related to the system’s steady-state conditions, and one focused on modeling the dynamics. In the latter one, the surface charge density derivative in time was the target variable, specifically, the derivative of the time series. The targets for the steady-state condition were the time to steady state, the value of dark currents, and the equivalent gas conductivity. The predictive variables used were the simulations’ input conditions, the system’s time-dependent solutions using the model described in Section 2, and the system’s electrostatic solutions.

The steady state in the time series data was identified by comparing the standard deviation of the time series in an analyzed rolling window and the whole collected time. It is important to note that the timestep in the simulations was roughly 10 h, so the steady-state time was inferred by this method, and the data had up to this precision.

The number of data points used for each rolling window in steady-state identification was 20, and the step was 1, so sequential rolling windows overlapped. A threshold $\epsilon = 10^{-3}$ was applied to identify the reach of the steady state based on a relative deviation for each window. All simulations reached a steady state, so there was no problem with unidentified cases. Therefore, the value considered for the steady state was the one at the last step for all variables.
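One possible reading of this procedure is sketched below. It assumes that the relative deviation is the ratio between the standard deviation in the rolling window and the standard deviation of the whole series, and that the steady state is assigned to the start of the first window that falls below the threshold; both choices are interpretations, not details confirmed by the text. The signal is synthetic and the function name is hypothetical.

```python
import math
from statistics import pstdev


def steady_state_index(series, window=20, eps=1e-3):
    """Start index of the first rolling window whose standard deviation,
    relative to that of the whole series, falls below eps.

    Returns None if no window satisfies the criterion.
    """
    total = pstdev(series)
    if total == 0.0:
        return 0  # constant series: steady from the start
    for start in range(len(series) - window + 1):  # step of 1, windows overlap
        if pstdev(series[start:start + window]) / total < eps:
            return start
    return None


# Synthetic saturating signal sampled every 8 h over 8000 h (illustrative only).
times = [8.0 * k for k in range(1000)]
signal = [1.0 - math.exp(-t / 300.0) for t in times]
idx = steady_state_index(signal)
t_steady = times[idx]
```

Because the criterion is evaluated on sampled data, the detected time can only be as precise as the sampling interval.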

The analyzed dataset for dynamics was surface charge densities, so each sampled coordinate in the arc length of the gas–solid insulator interface had a computed time to reach the steady state. The maximum value calculated on the arc was used as the time to reach the steady state for each simulation.

Some variables were scaled to bring the predictive and target variables close in order of magnitude. This was important because the algorithms used were sensitive to large differences in scale. Feature engineering was performed as well, specifically for dynamics learning. The list below presents the scaling and feature engineering that was performed.

Conductivity and currents: converted to pico-units by multiplying by $10^{12}$, or to nano-units by multiplying by $10^{9}$;
Time to steady state: converted to hours by dividing by 3600;
Feature engineering: the reciprocal of exponential of time divided by 300, i.e., $e^{- t / 300}$ ; the reciprocal of variables such as potential and surface electric field.
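The transformations in the list can be written as small helper functions. The names below are hypothetical, and the unit of `t` in the exponential feature is assumed to follow whatever time unit the dataset uses.

```python
import math


def to_pico(value_si):
    """Scale an SI value (e.g., a current in A) to pico-units."""
    return value_si * 1e12


def to_nano(value_si):
    """Scale an SI value to nano-units."""
    return value_si * 1e9


def seconds_to_hours(t_seconds):
    """Convert a time in seconds to hours."""
    return t_seconds / 3600.0


def engineered_features(t, potential, surface_field):
    """Extra predictors for dynamics learning: exp(-t/300) and reciprocals."""
    return {
        "exp_decay": math.exp(-t / 300.0),
        "inv_potential": 1.0 / potential,
        "inv_surface_field": 1.0 / surface_field,
    }


features = engineered_features(t=300.0, potential=500.0, surface_field=2.0)
```

Scaling of this kind keeps the fitted coefficients within a few orders of magnitude of one another, which matters for threshold-based sparsification, since a single cutoff is applied to all coefficients.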

4.3. Modelling

SR was used to learn the variables' values at steady state, and SINDy was used to predict the derivative of the surface charge density over time. The learning algorithms were applied to 80% of the data collected in the simulations (the training set); 15% was used as the validation set, and the test error was computed on the remaining 5%, which was used in neither training nor validation (the test set). The sets were split by random sampling, without replacement, of all collected data.
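A split of this kind can be sketched with the standard library as follows. The sample count of 120 and the seed are illustrative, and the function name is hypothetical.

```python
import random


def split_indices(n_samples, fractions=(0.80, 0.15, 0.05), seed=0):
    """Shuffle 0..n_samples-1 and cut it into train/validation/test index lists.

    Shuffling once and slicing is equivalent to sampling without replacement.
    The test set receives whatever remains after the first two cuts.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(120)
```

Fixing the seed makes the partition reproducible, so that the test set stays untouched across repeated tuning runs.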

Each algorithm, SR and SINDy, had hyperparameters to be tuned to improve performance on the learning task. The validation dataset was used in this tuning procedure. The process involved trial and error, evaluating the final validation error in each iteration, and a grid search to find the optimal combination of parameters. It is important to highlight that the test dataset was never used in this process; it was only evaluated at the end, when all hyperparameters had already been chosen.

The hyperparameters tuned in each learning case for SR are presented in the list below. Descriptions of all other parameters can be found in the docstrings of the code in the repository indicated in [11]. Other available input parameters in the software were not tuned, and their default values were used.

niterations: maximum number of iterations of the learning algorithm;
unary_operators: allowed unary operators;
binary_operators: allowed binary operators;
maxsize: maximum complexity, related to the number of operations allowed;
model_selection: model selection criteria;
parsimony: punishes complexity;
constraints: set constraints for the equations.
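The hyperparameters above correspond to keyword arguments of PySR's `PySRRegressor`. A sketch of how such a configuration might be collected is shown below. Every value is a placeholder and not a tuned setting from this study, and the commented lines assume that PySR is installed.

```python
# Placeholder values for illustration only; not the settings tuned in this work.
pysr_params = {
    "niterations": 100,                          # maximum number of iterations
    "unary_operators": ["exp", "log"],           # allowed unary operators
    "binary_operators": ["+", "-", "*", "/", "^"],
    "maxsize": 20,                               # maximum expression complexity
    "model_selection": "best",                   # model selection criterion
    "parsimony": 1e-3,                           # penalty on complexity
    "constraints": {"^": (-1, 1)},               # limit complexity of the exponent
}

# With PySR installed, the dictionary would be passed as keyword arguments:
#   from pysr import PySRRegressor
#   model = PySRRegressor(**pysr_params)
#   model.fit(X_train, y_train)
```

Keeping the settings in one dictionary makes it straightforward to iterate over alternative combinations during a grid search.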

The hyperparameters tuned in the learning for SINDy, mostly related to the STLSQ optimizer and the feature library, are presented in the list below. Descriptions of all other parameters can be found in the docstrings of the code in the repository indicated in [37,38]. Other available input parameters in the software were not tuned, and their default values were used.

threshold: minimum coefficient value to not be set to zero;
alpha: weight on the $L_{2}$ regularization;
max_iter: maximum iterations of the optimization algorithm;
polynomial_degree: polynomial degree of the feature library;
include_bias: include the 1s column in the feature library.
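In PySINDy, the first three entries are arguments of the `STLSQ` optimizer and the last two belong to the polynomial feature library. The sketch below assumes that `polynomial_degree` in the list corresponds to the `degree` argument of `PolynomialLibrary`. All values are placeholders rather than the tuned settings, and the commented lines assume that PySINDy is installed.

```python
# Placeholder values for illustration only; not the settings tuned in this work.
stlsq_params = {
    "threshold": 0.1,   # coefficients below this magnitude are set to zero
    "alpha": 0.05,      # weight of the L2 regularization
    "max_iter": 20,     # maximum number of thresholding iterations
}
library_params = {
    "degree": 2,            # polynomial degree of the feature library (assumed mapping)
    "include_bias": True,   # include the column of ones
}

# With PySINDy installed, the dictionaries would be used as:
#   import pysindy as ps
#   model = ps.SINDy(
#       optimizer=ps.STLSQ(**stlsq_params),
#       feature_library=ps.PolynomialLibrary(**library_params),
#   )
#   model.fit(x_train, t=dt)
```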

The SR implementation in Python used the PySR library described in [11], version 0.19.4, and the SINDy implementation used the PySINDy library described in [37,38], version 1.7.5.

5. Results

This section presents the results from the simulations and the learning tasks. The focus is on the performance of the fits, i.e., the root mean squared error, and on the complexity of the final learned equation. The ideal symbolic equation is the one with the lowest complexity that still achieves good accuracy, i.e., the point beyond which increasing complexity brings diminishing gains in accuracy.
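For reference, the error metrics reported in this work (root mean squared error, absolute error, and mean absolute percentage error) can be computed as in the sketch below. The function names and the sample values are illustrative, and the absolute error is taken here as a mean over the samples, which is an assumption.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values, not data from the simulations.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.1, 1.8, 4.4]
```

Since MAPE divides by the target, it is only meaningful when the target values stay away from zero.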

Table 1 presents the values for each variable used in the grid. The other parameters used in the simulation are presented in Table 2. The summary statistics of the target of each learning task are presented in Table 3.

5.1. Generated Datasets

This section briefly explores the datasets generated in the simulations so the learning tasks can be better understood. The data extracted in the simulations were related to the current, electric field, charge, and ion concentration. The values in the sampling times can be statistics from the gas volume, i.e., maximum, minimum, and average; sampled values from surfaces; and integrals over surfaces. Table 4 presents the variables extracted from the time-dependent simulations.

The electric and surface electric field datasets, presented in Table 4, were also extracted from the simulations’ electrostatic solutions to be used as predictive variables. The electrostatic solution is 10 times faster to finish than the simulations of electric currents coupled with DDR equations, taking around 8 s. Therefore, an analytical expression that uses electrostatic solutions as predictive variables can be helpful.

Figure 4 presents the dark current variation on the ground electrode by potential and ion-pair generation rate. The results are presented considering two sets of boundaries: Figure 4a shows the current only in the gas interface with the electrode, and Figure 4b shows the total current considering the solid and gas interfaces. It is possible to see that the gas–ground electrode current at steady state decreases with increasing potential and increases with the ion-pair generation rate. The increase of the total current with potential is related to the linear current increase in the solid–ground electrode interface.

Figure 5 presents the surface charge density and the time to steady state by normalized arc length on interface 1, as defined in Figure 3. It is possible to observe that the boundary conditions of zero concentration are satisfied in the electrodes and that the curved part of the solid insulator is an important factor in the shape of the surface charge density.

The time to steady state presents a narrow variation between voltages and ion-pair generation, and the crucial parts are close to the electrodes, where the maximum times occur. The curve near the ground electrode presents some numerical oscillation. This part was removed from the learning task as the variation of the surface charge density in this region is two orders of magnitude lower than the rest of the curve.

Figure 6 presents the evolution of the surface charge density at two arc length positions on interface 1, for $S = 8 \cdot 10^{7}$ IP/m³/s and $V = 346$ kV. Interface 1 exhibits both positive and negative charging, and the charging is monotonic, i.e., the first derivative does not change sign.

5.2. Dark Currents in Steady State

This learning problem aimed to find an expression that relates the ion-pair generation rate and voltage to the dark current in the system at steady state. The symbolic regression was performed using PySR and Table 5 presents the hyperparameters tuned. In the table, +, −, *, /, and ˆ are the binary operators of addition, subtraction, multiplication, division and exponentiation, respectively.

Equation (12) presents the final best fit obtained in terms of accuracy to predict the current in the gas–ground electrode interface, where V is in kV and S in IP/m³/s.

\mathrm{current}\ [\mathrm{pA}] = 8.64 \cdot 10^{-9} S + 6.71 \cdot 10^{-18} \frac{S^{2}}{V},

(12)

Table 6 presents the set of four equations with the highest complexity found by the symbolic regression fitting of current in the gas–ground electrode interface. Column ${RMSE}_{val}$ presents the mean and standard deviation of the error over eight validation groups.
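The score column of Table 6 can be cross-checked from its loss and complexity columns. The sketch below assumes the score follows PySR's convention, i.e., the negative slope of the log-loss with respect to complexity between consecutive expressions; Equation (11) is not reproduced in this section, so that definition should be treated as an assumption. The numbers are copied from Table 6, and the helper name `scores` is illustrative.

```python
import math

# (complexity, MSE loss in pA^2), copied from Table 6
table6 = [(10, 1.88e-7), (12, 7.33e-8), (14, 7.22e-8), (15, 1.35e-8)]

def scores(rows):
    """Score of each row relative to the previous, less complex row."""
    out = []
    for (c_prev, l_prev), (c, l) in zip(rows, rows[1:]):
        out.append(-(math.log(l) - math.log(l_prev)) / (c - c_prev))
    return out

table6_scores = scores(table6)
for (c, _), s in zip(table6[1:], table6_scores):
    print(f"complexity {c}: score = {s:.3g}")
```

Under that assumption, the recomputed values for complexities 12, 14 and 15 come out close to the tabulated 0.47, 7.08 · 10⁻³ and 1.68; small differences are expected because the tabulated losses are rounded. The score of the first row cannot be recomputed, since its less complex predecessor is not listed.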

The root mean squared error for the test set of Equation (12) is $1.04 \cdot 10^{-4}$ pA, and Figure 7 presents the true vs. predicted values of the target using the learned model for the training and test datasets.

Equation (13) presents the final best fit obtained in terms of accuracy to predict the current in the solid–ground electrode interface, where V is in kV and S in IP/m³/s. The value of the niterations hyperparameter was 250; the other hyperparameters had the same values as reported in Table 5. It is important to note that the total dark current through the ground electrode is the sum of the currents through the gas and solid interfaces with the ground electrode, i.e., the sum of Equations (12) and (13).

\mathrm{current}\ [\mathrm{pA}] = 6.35 \cdot 10^{-3} V + 4.23 \cdot 10^{-11} e^{\frac{2.69}{V}} S .

(13)
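Equations (12) and (13) are simple enough to evaluate directly. The following sketch transcribes them into Python and sums them to obtain the total dark current through the ground electrode; the function names are illustrative, and the expressions should only be trusted within the simulated grid of Table 1 (S between 10⁷ and 2 · 10⁸ IP/m³/s, V between 15 and 500 kV).

```python
import math

def gas_current_pA(S, V):
    """Equation (12): gas-ground electrode current in pA (S in IP/m^3/s, V in kV)."""
    return 8.64e-9 * S + 6.71e-18 * S**2 / V

def solid_current_pA(S, V):
    """Equation (13): solid-ground electrode current in pA (S in IP/m^3/s, V in kV)."""
    return 6.35e-3 * V + 4.23e-11 * math.exp(2.69 / V) * S

def total_current_pA(S, V):
    """Total dark current through the ground electrode: Equation (12) + Equation (13)."""
    return gas_current_pA(S, V) + solid_current_pA(S, V)

# Condition used for Figure 6: S = 8e7 IP/m^3/s, V = 346 kV.
S, V = 8e7, 346.0
print(gas_current_pA(S, V), solid_current_pA(S, V), total_current_pA(S, V))
```

At this operating point the gas contribution is dominated by the term linear in S, while the solid contribution is dominated by the term linear in V, which is the behavior described for Figure 4.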

Table 7 presents the four equations with the highest complexity found by the symbolic regression fitting of current in the solid–ground electrode interface.

The root mean squared error for the test set of Equation (13) is $8.99 \cdot 10^{-5}$ pA, and Figure 8 presents the true vs. predicted values of the target using the learned model for the training and test datasets.

5.3. Time to Steady State

For each simulation, the steady-state time was defined as the maximum time it took for the surface charge density to stabilize across the entire arc of interface 1, as described in Figure 3. The symbolic regression was performed using PySR and Table 8 presents the hyperparameters tuned. In the table, +, −, *, /, and ˆ are the binary operators of addition, subtraction, multiplication, division and exponentiation, respectively.
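The definition above can be sketched as follows. The stabilization criterion is not specified in this section, so the sketch assumes a hypothetical relative tolerance band around the final value (`rel_tol`, 1% here); the function names and the synthetic exponential curves are illustrative and do not come from the simulations.

```python
import math

def settling_time(t, sigma, rel_tol=0.01):
    """First time after which sigma stays within rel_tol of its final value."""
    final = sigma[-1]
    band = rel_tol * abs(final)
    idx = 0
    for i, value in enumerate(sigma):
        if abs(value - final) > band:
            idx = i + 1  # the sample right after the last excursion
    return t[min(idx, len(t) - 1)]

def time_to_steady_state(t, sigma_by_point, rel_tol=0.01):
    """Maximum settling time over all sampled arc positions of the interface."""
    return max(settling_time(t, s, rel_tol) for s in sigma_by_point)

# Synthetic charging curves at two arc positions (assumed time constants, in hours).
t = [float(h) for h in range(0, 5001)]
taus = [100.0, 300.0]
curves = [[1.0 - math.exp(-h / tau) for h in t] for tau in taus]
print(time_to_steady_state(t, curves))
```

Taking the maximum over the arc makes the slowest point of the interface define the steady-state time of the whole simulation, which matches the definition used for this learning task.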

The best equation in terms of accuracy, with ties broken by score, is presented in Equation (14).

\mathrm{Time}_{sst} = 1.52 \cdot 10^{4}\, \tilde{E}r_{i2}^{ES} \left(S - 547\, \tilde{E}r_{i2}^{ES} - 4.72 \cdot 10^{6}\right)^{-1} - 3.40 \cdot 10^{3} \max(Ez_{i2}^{ES}),

(14)

where time is in hours; $Ej_{ik}^{ES}$ is the electric field in V/m on the surface of interface k in the j-direction from the electrostatic solution; j = r, z and k = 1, 2. The tilde over the letter E denotes the median operator.

Table 9 presents the four equations with the highest complexity found by the symbolic regression fitting of the time to reach a steady state in surface charge density. The variable $\overline{\lVert E^{ES} \rVert}$ in Table 9 is the mean norm of the electric field in the gas volume from the electrostatic solution.

The root mean squared test error of Equation (14) is 110 h, and Figure 9 presents the true vs. predicted value for the target using the learned model.

5.4. Gas Conductivity in Steady State

This learning problem aimed to find an expression that relates the ion-pair generation rate and the electric field in the r-direction to the gas conductivity at steady state, using the time-dependent simulations of the model described in Section 2. The logarithm of the conductivity was modeled, which helps the search because it improves the conditioning of the loss. The symbolic regression was performed using PySR, and Table 10 presents the hyperparameters tuned. In the table, +, −, *, /, and ˆ are the binary operators of addition, subtraction, multiplication, division and exponentiation, respectively.

Equation (15) presents the final best fit obtained in terms of accuracy and complexity to predict the conductivity of the gas at steady state. Because all of the highest-complexity expressions had the same validation error, the equation was chosen based on the score.

\log(\sigma_{gas}) = 4.63 \cdot 10^{-8} Er + \log\left(\frac{S}{Er}\right) - 45.4,

(15)

where $\sigma_{gas}$ is the gas electric conductivity in S/m, S is in IP/m³/s, and $Er$ is the electric field in the r-direction in V/m.
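Equation (15) can be evaluated directly, as in the sketch below. It assumes that log denotes the natural logarithm, which is consistent with the range of the log-conductivity target in Table 3 but is not stated explicitly in this section; the function names and the sample operating point are illustrative.

```python
import math

def log_sigma_gas(S, Er):
    """Equation (15): log-conductivity of the gas (S in IP/m^3/s, Er in V/m, Er > 0).

    Assumes 'log' is the natural logarithm.
    """
    return 4.63e-8 * Er + math.log(S / Er) - 45.4

def sigma_gas(S, Er):
    """Gas conductivity in S/m, obtained by exponentiating Equation (15)."""
    return math.exp(log_sigma_gas(S, Er))

# Illustrative operating point (not taken from the paper's dataset).
S, Er = 1e8, 1e6
print(log_sigma_gas(S, Er), sigma_gas(S, Er))
```

Since the model is fitted in log space, a test error of 0.55 in the log corresponds to a multiplicative error in the conductivity itself, which is worth keeping in mind before using the expression inside a time-dependent simulation.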

Table 11 presents the four equations with the highest complexity found by the symbolic regression fittings of conductivity.

The root mean squared error for the test set of Equation (15) is 0.55 log(S/m), and Figure 10 presents the true vs. predicted values of the target using the learned model for the training and test datasets.

5.5. Surface Charge Density Dynamics

The objective of modeling the surface charge density is to obtain an analytical expression of the form of Equation (16). The time derivative is computed from simulation data and is the learning target of this task. The SINDy algorithm is applied to fit the function f and obtain a sparse symbolic representation.

\frac{\partial σ}{\partial t} = f (X) .

(16)

The candidate library of equations used in SINDy included second-degree polynomials with bias (exponent = 0) excluded. As the magnitude of the predictive variables was low, the optimization process threshold was 0.005. Table 12 presents all hyperparameters selected in the tuning and used in the fitting.
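The fitting step described above can be illustrated with a small NumPy sketch of sequentially thresholded least squares over a second-degree polynomial library without bias. This is not the PySINDy implementation used in the study: the functions `poly2_library` and `stls` are written here for illustration, the data are synthetic, and only the default values of `threshold`, `alpha` and `max_iter` mirror Table 12.

```python
from itertools import combinations_with_replacement
import numpy as np

def poly2_library(X):
    """Degree-1 and degree-2 monomials of the columns of X, without a bias column."""
    n = X.shape[1]
    cols = [X[:, i] for i in range(n)]
    cols += [X[:, i] * X[:, j] for i, j in combinations_with_replacement(range(n), 2)]
    return np.column_stack(cols)

def stls(theta, y, threshold=0.005, alpha=0.1, max_iter=20):
    """Ridge fit, prune |coef| < threshold, refit on the survivors, repeat."""
    active = np.ones(theta.shape[1], dtype=bool)
    coef = np.zeros(theta.shape[1])
    for _ in range(max_iter):
        A = theta[:, active]
        w = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)
        coef = np.zeros(theta.shape[1])
        coef[active] = w
        keep = np.abs(coef) >= threshold
        converged = np.array_equal(keep, active)
        active = keep
        if converged or not active.any():
            break
    coef[~active] = 0.0
    return coef

# Synthetic target built from two library terms: x1 and x1*x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 0.5 * X[:, 0] - 0.2 * X[:, 0] * X[:, 1]
coef = stls(poly2_library(X), y)
print(coef)  # column order: x1, x2, x1^2, x1*x2, x2^2
```

The pruning step compares raw coefficient magnitudes against the threshold, so the outcome depends on how the predictive variables are scaled.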

The final equation obtained is presented in Equation (17), where t is in hours, V is the potential applied in kV, and $Ej_{i1}$ is the electric field in V/m on the surface of interface 1 in the j-direction at time t; j = r, z. The model’s root mean squared test error is 1.12 nC/m²/s, and Figure 11 presents the true vs. predicted values of the target using the learned model in the training and test sets.

\frac{\partial \sigma}{\partial t} = (0.010\, Er_{i1} - 1.108 - 0.076\, V - 0.032\, Ez_{i1})\, e^{-t/300} + 2.765\, e^{-t/150} .

(17)

6. Discussion

The learning tasks were successfully performed using the symbolic regression and SINDy algorithms, yielding relatively good results based on the validation and test errors. Dark currents and conductivity required low-complexity symbolic equations to fit the data, while time to steady state, even with highly complex equations, could not reach very good results.

The final Equation (12) for dark current in the gas is plausible and has the same behavior related to radiation as observed in the literature [3], i.e., the dark current increases linearly with ion-pair generation rate in the saturation condition reached in steady state. In the lower voltage conditions, the dark current through the gas–electrode is higher in the steady state as there is a higher concentration of ions in the gas that generate the current, which is caused by the lower flux in the interface of the gas–solid insulator. The higher the voltage, the higher the current in the solid insulator and the flux in the gas–solid insulator boundary.

The results derived in [6] disagree with Equation (12), as the dark current there increases with the applied potential. Still, it is important to note that the model constructed by the authors is for air and for a different geometry, which does not incorporate the intricate dynamics of the flux at the gas–solid insulator boundary. The authors also observe a linear relation between the dark current and the radiation dosage, which is related to the ion-pair generation rate.

The time to steady state does not vary much between the conditions simulated, demonstrating it is not strongly dependent on ion-pair generation rate and voltage. Further analysis, including simulations with varying conductivities and permittivities in the gas and solid, might improve the model with more significant variations in the time to steady state, as different conductivities will influence the system time constant. The relatively high test error indicates that the model will perform even worse for other simulation conditions not present in the training dataset.

The conductivity at steady state was fitted with a relatively low test error. Still, this error can significantly impact time-dependent simulations, as the errors might add up over time. Furthermore, the values of the electric field and ion concentrations in the initial times of the simulation might differ considerably from the steady-state values on which the model was fitted, so it is not currently advisable to use it as a surrogate model. Further investigations are needed to compare simulations using DDR and the analytical expression. The DDR shows that the conductivity of the gas cannot be modeled as constant (see [9] for further reference), so the current analytical expression is a better alternative.

Good model performance is observed in SINDy surface charge density dynamics fitting. This learning task was more demanding, as it had more data points—it is a time series that varies in space, and the surface charging couples gas dynamics and electric field solutions. However, work is needed to improve the expressions for use as a surrogate model in the simulations. The current one demonstrated the power of the algorithm in discovering dynamics from data; even for large amounts of data and a big library, it efficiently fitted a sparse representation.

The saturation dynamics of the surface charge density observed in Figure 6, known as the capacitive-resistive transition, are expected and have been described in previous papers [1,9]. The charging of positive or negative particles is influenced by the geometry of the solid insulator interface and its proximity to the electrodes, as seen in Figure 5. The fast saturation behavior observed in the curves in Figure 6 is related to neglecting the interface conductivity of the boundary, so all current from the gas that cannot be dissipated by the current in the solid insulator accumulates at the interface, as can be seen in Equation (1) excluding the last term. The geometry and distance from the electrode create specific points of accumulation on the boundary; the same preferred accumulation of negative particles near the positive electrode is observed in [9]. The varying time to steady state observed in Figure 5b is also discussed and derived in [40], although with a different method. As discussed previously, the linear increase in current in the solid with potential is expected and can be seen in Figure 4b.

The physical units of the learned Equations (12) and (14) can be used to infer the physical units of the learned coefficients using dimensional analysis. These units and values can be coupled with prior knowledge about the system to develop its symbolic representation further, for example, by combining system conditions that were not varied in the study to compute the learned coefficients. This further step is out of the scope of this work, but it might be an interesting path to explore with combinatorics and symbolic programming.

One of the challenges in the learning tasks was the considerable discrepancy in the orders of magnitude of the variables. This mainly interferes with applying STLS in SINDy, whose pruning step compares the absolute value of each coefficient against the threshold. The discrepancy was overcome by normalizing the variables or transforming the units, for example, by using the kilo, nano, and pico prefixes.
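A short numerical illustration of why scaling matters for threshold-based pruning is given below. The coefficient is borrowed from Equation (12) purely for its magnitude (that equation was fitted by symbolic regression, not SINDy), and the rescaling factor of 10⁸ is a hypothetical choice made for this example.

```python
threshold = 0.005             # pruning threshold from Table 12

coef_per_S = 8.64e-9          # pA per (IP/m^3/s), magnitude taken from Equation (12)
scale = 1e8                   # hypothetical unit: express S in multiples of 1e8 IP/m^3/s
coef_scaled = coef_per_S * scale  # same physical dependence, rescaled variable

survives_raw = abs(coef_per_S) >= threshold
survives_scaled = abs(coef_scaled) >= threshold
print(survives_raw, survives_scaled)
```

The physical relation is identical in both cases; only the units of the predictive variable change, and with them whether the term survives the pruning step.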

It is important to point out that the system’s physics is simplified, and other phenomena might be important in the simulated conditions, such as the anode’s emittance of electrons, the surface conductivity of the gas–solid insulator interface, and spatial and temporal variation in temperature. The conditions simulated are restricted to the same geometry and material properties. Further enrichment of the dataset with variations of these properties is interesting for building more generally applicable analytical expressions for the HVDC-GIS in RIC conditions.

7. Conclusions

This work explored machine learning methods to derive expressions for time to steady state, dark current, gas conductivity, and surface charge density. The learning tasks were completed successfully using simulated data, yielding new analytical expressions useful for analyzing HVDC systems in RIC conditions and for further developing machine learning predictions. The absolute error and MAPE reported for predicting dark currents are $1.04 \cdot 10^{-4}$ pA and 0.01%, respectively, in the gas–ground electrode interface, and $8.99 \cdot 10^{-5}$ pA and 0.04% in the solid–ground electrode interface. The predictions of time to steady state performed less well on the test set, with an error of around 110 h and a MAPE of around 3%. The gas conductivity model performed well on the test set, with an error of 0.55 log(S/m) and a MAPE of 1%. The test error on the dynamics estimation was 1.12 nC/m²/s. Further work with varying gas and solid properties and geometries might improve the performance of the models and make the expressions useful in more general contexts. All algorithms used in this work might also be applied to experimental data, which might unveil discoveries about the systems’ behaviors.

Author Contributions

Conceptualization, K.U.J., F.L. and N.M.; methodology, K.U.J. and F.L.; software, K.U.J.; formal analysis, K.U.J. and F.L.; data curation, K.U.J.; writing—original draft preparation, K.U.J. and F.L.; writing—review and editing, K.U.J., F.L. and N.M. All authors have read and agreed to the published version of the manuscript.

Funding

Project financially supported by BIRD 2023 Research Program of University of Padova (prot. BIRD235204).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RMSE	Root Mean Squared Error
MSE	Mean Squared Error
DDR	Drift-Diffusion-Reaction
SR	Symbolic Regression
SINDy	Sparse Identification of Nonlinear Dynamics
STLS	Sequentially Thresholded Least Squares
HVDC	High Voltage Direct Current
GIL	Gas Insulated Line
GIS	Gas Insulated System
AC	Alternating Current
DC	Direct Current
RIC	Radiation-induced conductivity

References

1. Lucchini, F.; Marconato, N.; Bettini, P. Automatic optimization of gas insulated components based on the streamer inception criterion. Electronics 2021, 10, 2280.
2. Volpov, E. Electric field modeling and field formation mechanism in HVDC SF₆ gas insulated systems. IEEE Trans. Dielectr. Electr. Insul. 2003, 10, 204–215.
3. Lucchini, F.; Marconato, N. Development of HVDC Gas-Insulated Components for the Power Supply of Neutral Beam Injectors. IEEE Access 2023, 11, 9731–9741.
4. Fujiwara, Y.; Hanada, M.; Inoue, T.; Miyamoto, K.; Miyamoto, N.; Ohara, Y.; Okumura, Y.; Watanabe, K. Radiation induced conductivity and voltage holding characteristics of insulation gas for the ITER NBI. AIP Conf. Proc. 1998, 439, 205–216.
5. Fujiwara, Y.; Inoue, T.; Miyamoto, K.; Miyamoto, N.; Ohara, Y.; Okumura, Y.; Watanabe, K. Influence of radiation on insulation gas at the ITER–NBI system. Fusion Eng. Des. 2001, 55, 1–8.
6. Hodgson, E.; Moroño, A. A model for radiation induced conductivity in neutral beam injector insulator gases. J. Nucl. Mater. 2002, 307, 1660–1663.
7. Hodgson, E.; Moroño, A. Radiation effects on insulating gases for the ITER NBI system. J. Nucl. Mater. 1998, 258, 1827–1830.
8. De Lorenzi, A.; Grando, L.; Pesce, A.; Bettini, P.; Specogna, R. Modeling of epoxy resin spacers for the 1 MV DC gas insulated line of ITER neutral beam injector system. IEEE Trans. Dielectr. Electr. Insul. 2009, 16, 77–87.
9. Lucchini, F.; Frescura, A.; Urazaki Junior, K.; Marconato, N.; Bettini, P. Modeling Approaches for Accounting Radiation-Induced Effect in HVDC-GIS Design for Nuclear Fusion Applications. Appl. Sci. 2024, 14, 11666.
10. Romanelli, F. Divertor Tokamak Test facility Project: Status of Design and Implementation. Nucl. Fusion 2024, 64, 112015.
11. Cranmer, M. Interpretable machine learning for science with PySR and SymbolicRegression.jl. arXiv 2023.
12. Champion, K.; Lusch, B.; Kutz, J.N.; Brunton, S.L. Data-driven discovery of coordinates and governing equations. Proc. Natl. Acad. Sci. USA 2019, 116, 22445–22451.
13. Brunton, S.L.; Proctor, J.L.; Kutz, J.N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 2016, 113, 3932–3937.
14. Loiseau, J.C.; Brunton, S.L. Constrained sparse Galerkin regression. J. Fluid Mech. 2018, 838, 42–67.
15. Pecile, A.; Demo, N.; Tezzele, M.; Rozza, G.; Breda, D. Data-driven Discovery of Delay Differential Equations with Discrete Delays. arXiv 2024.
16. Ebers, M.R.; Steele, K.M.; Kutz, J.N. Discrepancy Modeling Framework: Learning Missing Physics, Modeling Systematic Residuals, and Disambiguating between Deterministic and Random Effects. SIAM J. Appl. Dyn. Syst. 2024, 23, 440–469.
17. Lore, J.; De Pascuale, S.; Laiu, P.; Russo, B.; Park, J.S.; Park, J.; Brunton, S.; Kutz, J.; Kaptanoglu, A. Time-dependent SOLPS-ITER simulations of the tokamak plasma boundary for model predictive control using SINDy. Nucl. Fusion 2023, 63, 4.
18. Lucchini, F.; Frescura, A.; Torchio, R.; Alotto, P.; Bettini, P. Reduced order modeling for real-time monitoring of structural displacements due to electromagnetic forces in large scale tokamaks. Plasma Phys. Control. Fusion 2024, 66, 11.
19. Alves, E.P.; Fiuza, F. Data-driven discovery of reduced plasma physics models from fully kinetic simulations. Phys. Rev. Res. 2022, 4, 033192.
20. Matteri, A.; Palo, M.; Ogliari, E.; Schubert, B.; Wei, J.; Gu, K.; Liu, W. Separation of radio-frequency signals triggered by valve switching in a HVDC converter by supervised machine learning methods. In Proceedings of the 2021 IEEE International Conference on Environment and Electrical Engineering and 2021 IEEE Industrial and Commercial Power Systems Europe (EEEIC / I&CPS Europe), Bari, Italy, 7–10 September 2021; pp. 1–6.
21. Narendra, K.; Sood, V.; Khorasani, K.; Patel, R. Investigation into an artificial neural network based on-line current controller for an HVDC transmission link. IEEE Trans. Power Syst. 1997, 12, 1425–1431.
22. Dash, P.; Routray, A.; Mishra, S. A neural network based feedback linearising controller for HVDC links. Electr. Power Syst. Res. 1999, 50, 125–132.
23. Beura, C.P.; Wenger, P.; Tozan, E.; Beltle, M.; Tenbohlen, S. Classification of Partial Discharge Sources in HVDC Gas Insulated Switchgear using Neural Networks. In Proceedings of the VDE High Voltage Technology; 4. ETG-Symposium, Berlin, Germany, 8–10 November 2022; VDE: Berlin, Germany, 2022; pp. 1–6.
24. Tuyet-Doan, V.N.; Do, T.D.; Tran-Thi, N.D.; Youn, Y.W.; Kim, Y.H. One-Shot Learning for Partial Discharge Diagnosis Using Ultra-High-Frequency Sensor in Gas-Insulated Switchgear. Sensors 2020, 20, 5562.
25. Seitz, S.; Götz, T.; Lindenberg, C.; Tetzlaff, R.; Schlegel, S. Towards Generalizable Classification of Partial Discharges in Gas-Insulated HVDC Systems Using Neural Networks: Protrusions and Particles. IEEE Trans. Power Deliv. 2024, 39, 1491–1499.
26. Chen, Q.; Wu, J.; Li, Q.; Gao, X.; Yu, R.; Guo, J.; Peng, G.; Yang, B. Long Short-Term Memory Network-Based HVDC Systems Fault Diagnosis under Knowledge Graph. Electronics 2023, 12, 2242.
27. Patil, M.; Paramane, A.; Das, S.; Rao, U.M.; Rozga, P. Hybrid Algorithm for Dynamic Fault Prediction of HVDC Converter Transformer Using DGA Data. IEEE Trans. Dielectr. Electr. Insul. 2024, 31, 2128–2135.
28. Yousaf, M.Z.; Singh, A.R.; Khalid, S.; Bajaj, M.; Kumar, B.H.; Zaitsev, I. Enhancing HVDC transmission line fault detection using disjoint bagging and bayesian optimization with artificial neural networks and scientometric insights. Sci. Rep. 2024, 14, 23610.
29. Zhou, R.; Gao, W.; Liu, W.; Ding, D.; Zhang, B. Statistical Feature Extraction Combined with Generalized Discriminant Component Analysis Driven SVM for Fault Diagnosis of HVDC GIS. Energies 2021, 14, 7674.
30. Lin, C.; Zhu, L.; Lin, X.; Chen, Y.; Chen, H.; Chen, H.; Zheng, Y. AC Surface Flashover Voltage Prediction of Epoxy Resin by Tree-Based Model. In Proceedings of the 5th International Symposium on Plasma and Energy Conversion, Nanjing, China, 27–29 October 2023; Springer Nature: Singapore, 2024; pp. 399–412.
31. Luo, Y.; Tang, J.; Pan, C.; Yin, J.; Zhang, Y.; Zhang, B.; Zhu, Q. The Transition of Surface Charge Accumulation Dominating Way in HVDC GIS. In Proceedings of the 2018 IEEE International Conference on High Voltage Engineering and Application (ICHVE), Athens, Greece, 10–13 September 2018; pp. 1–4.
32. Wu, S.; Xu, H.; Zhang, X.; Liang, Y.; Shao, Y.; Wang, C.; Tu, Y.; Xu, Y. Towards the surface flashover in DC GIL/GIS: The electric field distribution and the surface charge accumulation. Phys. Scr. 2022, 97, 072001.
33. Zhong, J.; Zhang, B.; Guo, Y.; Wang, Z.; Yao, Y.; Zhang, H.; Liu, Y. Gas and solid insulation in HVDC gas insulated switchgear. In Proceedings of the 2017 IEEE Conference on Energy Internet and Energy System Integration (EI2), Beijing, China, 26–28 November 2017; pp. 1–5.
34. Ma, G.m.; Zhou, H.y.; Li, C.r.; Jiang, J.; Chen, X.w. Designing epoxy insulators in SF₆-filled DC-GIL with simulations of ionic conduction and surface charging. IEEE Trans. Dielectr. Electr. Insul. 2015, 22, 3312–3320.
35. Qin, S.c.; Tu, Y.p.; Wang, C.; Zhou, F.w.; Ma, G.m.; Zhou, H.y. The influence of the insulator volume conductivity on charge accumulation in HVDC-GIL. In Proceedings of the 2016 IEEE Electrical Insulation Conference (EIC), Montreal, QC, Canada, 19–22 June 2016; pp. 325–328.
36. Kaheman, K.; Kutz, J.N.; Brunton, S.L. SINDy-PI: A robust algorithm for parallel implicit sparse identification of nonlinear dynamics. Proc. R. Soc. A Math. Phys. Eng. Sci. 2020, 476, 20200279.
37. de Silva, B.; Champion, K.; Quade, M.; Loiseau, J.C.; Kutz, J.; Brunton, S. PySINDy: A Python package for the sparse identification of nonlinear dynamical systems from data. J. Open Source Softw. 2020, 5, 2104.
38. Kaptanoglu, A.A.; de Silva, B.M.; Fasel, U.; Kaheman, K.; Goldschmidt, A.J.; Callaham, J.; Delahunt, C.B.; Nicolaou, Z.G.; Champion, K.; Loiseau, J.C.; et al. PySINDy: A comprehensive Python package for robust sparse system identification. J. Open Source Softw. 2022, 7, 3994.
39. Urazaki Junior, K. 2024. Available online: https://github.com/kjurazaki/HVDC-GIL-physics-informed-ML (accessed on 23 January 2025).
40. Winter, A.; Kindersberger, J. Transient field distribution in gas-solid insulation systems under DC voltages. IEEE Trans. Dielectr. Electr. Insul. 2014, 21, 116–128.

Figure 1. Expression tree for $1.15 y + 0.86$.

Figure 2. Workflow: steps and subtasks performed.

Figure 3. Axisymmetric geometry with conical insulator used in simulations. (a) Domains, (b) interfaces, (c) electrodes.

Figure 4. Dark currents in the ground electrode as a function of potential and ion-pair generation rate. (a) Dark current through the gas–electrode interface. (b) Dark current through the ground electrode.

Figure 5. Steady-state time and surface charge density by normalized arc length on interface 1, as identified in Figure 3. (a) Surface charge density variation along solid insulator boundary; (b) time to steady-state variation along solid insulator boundary.

Figure 6. Evolution in time of surface charge density in two arc positions on interface 1, as identified in Figure 3. (a) Increase in positive surface charge density until saturation on normalized arc length = 0.13; (b) Increase in negative surface charge density until saturation on normalized arc length = 0.4.

Figure 7. Dark currents in gas–ground interface symbolic regression target vs. predicted for the best equation in the training and test set.

Figure 8. Dark currents in solid–ground interface symbolic regression target vs. predicted for the best equation in the training and test set.

Figure 9. Time to steady-state symbolic regression target vs. predicted for the selected equation in the training and test sets.

Figure 10. Conductivity symbolic regression target vs. predicted for the best equation in the training and test set.

Figure 11. Derivative target vs. predicted of the learned equation in training and test sets.

Table 1. Varied parameters S and potential in the simulations.

Parameter	Values
S [IP/(m³·s)]	$10^{7}$ , $4 \cdot 10^{7}$ , $5 \cdot 10^{7}$ , $8 \cdot 10^{7}$ , $10^{8}$ , $2 \cdot 10^{8}$
Potential applied [kV]	15–500, 20 uniformly spaced data points

Table 2. List of parameters fixed in the simulations.

Parameter	Value
Gas pressure (P) [MPa]	0.4
Recombination coefficient (R) [m³/s]	$6 \cdot 10^{- 13}$
Positive mobility ( $μ_{p}$ ) [m²/(V·s)]	$4.8 \cdot 10^{- 6}$
Negative mobility ( $μ_{n}$ ) [m²/(V·s)]	$4.8 \cdot 10^{- 6}$
Positive diffusion ( $D_{p}$ ) [m²/s]	$1.2 \cdot 10^{- 7}$
Negative diffusion ( $D_{n}$ ) [m²/s]	$1.2 \cdot 10^{- 7}$
Relative permittivity of epoxy [-]	5
Electric conductivity of epoxy (k) [S/m]	$4.2 \cdot 10^{- 17}$
Relative permittivity of SF₆ [-]	1.002

Table 3. Summary statistics of the targets.

Variable	Mean	std	Min/Max
Dark current gas–ground [pA]	0.68	0.52	0.09/1.74
Dark current solid–ground [pA]	1.67	0.93	0.09/3.18
Time to steady-state [h]	3458	171	3067/4484
Surface charge dynamics [nC/m²/s]	−0.51	2.25	−40.75/9.43
Conductivity log [log(S/m)]	−40.29	1.28	−45.69/−37.27

Table 4. Datasets extracted from time-dependent simulations.

Dataset	Description
Dark current	Integral of $J_{G}$ in the ground electrode.
Electric field	Volume integral on the gas, computing maximum, minimum, and average on time of field in r and z directions.
Surface charge density	Sampled charge density values, in the arc and in time, of both solid-gas interfaces.
Surface electric field	Minimum, maximum, and average of the electric field in the surfaces of the solid-gas interface and electrodes in time.
Ion concentration	Volume integral on the gas, computing total ion concentration on time.

Table 5. Hyperparameters tuned for SR on dark currents in gas–ground electrode.

Hyperparameter	Value
niterations	500
unary_operators	exp, log
binary_operators	+, −, *, /,ˆ
maxsize	15
model_selection	accuracy
parsimony	0.002
constraints	ˆ:(−1, 3)

Table 6. Expressions for dark currents in gas–ground electrode fitted by symbolic regression.

$O$ *	Equation	Loss **	Score ***	${RMSE}_{val}$
10	$8.64 \cdot 10^{- 9} S + {(1.02 \cdot 10^{- 9} S)}^{log (V)}$	$1.88 \cdot 10^{- 7}$	$0.47$	$(7.94 \pm 3.86) \cdot 10^{- 4}$
12	$8.64 \cdot 10^{- 9} S + {(2.67 \cdot 10^{- 9} S^{0.954})}^{log (V)}$	$7.33 \cdot 10^{- 8}$	$0.47$	$(5.77 \pm 2.33) \cdot 10^{- 4}$
14	$8.64 \cdot 10^{- 9} (S - V) + {(2.67 \cdot 10^{- 9} S^{0.954})}^{log (V)}$	$7.22 \cdot 10^{- 8}$	$7.08 \cdot 10^{- 3}$	$(5.75 \pm 2.33) \cdot 10^{- 4}$
15	$8.64 \cdot 10^{- 9} S + 6.715 \cdot 10^{- 18} \frac{S^{2}}{V}$	$1.35 \cdot 10^{- 8}$	$1.68$	$(1.72 \pm 0.17) \cdot 10^{- 4}$

* As computed by PySR: the final expressions reported are simplified, ** MSE [pA²], *** As defined in Equation (11).

Table 7. Expressions for dark currents in solid–ground electrode fitted by symbolic regression.

$O$ *	Equation	Loss **	Score ***	${RMSE}_{val}$
9	$6.35 \cdot 10^{- 3} V + 4.34 \cdot 10^{- 11} S$	$5.19 \cdot 10^{- 8}$	$4.17 \cdot 10^{- 2}$	$(3.31 \pm 1.69) \cdot 10^{- 4}$
11	$6.35 \cdot 10^{- 3} V + 4.23 \cdot 10^{- 11} (S + \frac{S}{V})$	$3.22 \cdot 10^{- 8}$	$0.24$	$(3.07 \pm 1.82) \cdot 10^{- 4}$
12	$6.35 \cdot 10^{- 3} V + 4.23 \cdot 10^{- 11} e^{\frac{1.65}{V}} S$	$2.04 \cdot 10^{- 8}$	$0.45$	$(2.59 \pm 1.61) \cdot 10^{- 4}$
13	$6.35 \cdot 10^{- 3} V + 4.23 \cdot 10^{- 11} e^{\frac{2.69}{V}} S$	$1.26 \cdot 10^{- 8}$	$0.49$	$(2.01 \pm 1.24) \cdot 10^{- 4}$

* As computed by PySR: the final expressions reported are simplified, ** MSE [pA²], *** As defined in Equation (11).

Table 8. Hyperparameters tuned for SR on time to steady state.

Hyperparameter	Value
niterations	10,000
unary_operators	exp, log
binary_operators	+, −, *, /,ˆ
maxsize	20
model_selection	accuracy
parsimony	0.0005
constraints	ˆ:(−1, 3)

Table 9. Expressions for time to reach steady state fitted by symbolic regression.

$O$ *	Equation	Loss ** ( $\times 10^{4}$ )	Score *** ( $\times 10^{- 4}$ )	${RMSE}_{val}$
16	$max (E z_{i 2}^{E S}) {(8.47 \cdot 10^{- 5} S - 4.08 \cdot 10^{- 2} \bar{∥ E^{E S} ∥} - 342)}^{- 1} + log (\tilde{E} r_{i 2}^{E S}) + 3.48 \cdot 10^{3}$	$1.80$	$18.2$	$(90 \pm 8)$
17	$1.52 \cdot 10^{4} \tilde{E} r_{i 2}^{E S} {(S - 547 \tilde{E} r_{i 2}^{E S} - 4.72 \cdot 10^{6})}^{- 1} - 3.40 \cdot 10^{3} max (E z_{i 2}^{E S})$	$1.63$	988	$(85 \pm 25)$
19	$2.12 \cdot 10^{- 4} max (E z_{i 2}^{E S}) + 1.52 \cdot 10^{4} \tilde{E} r_{i 2}^{E S} - 5.3 \cdot 10^{7} {(S - 547 \tilde{E} r_{i 2}^{E S} - 4.83 \cdot 10^{6})}^{- 1} + 3.40 \cdot 10^{3}$	$1.63$	$4.15$	$(85 \pm 25)$
20	$(6.53 \cdot 10^{- 5} + {(6.43 \cdot 10^{- 5} S - 3.51 \cdot 10^{- 2} \tilde{E} r_{i 2}^{E S} - 311)}^{- 1}) \cdot \tilde{E} r_{i 1}^{E S} - log (S) + 3.41 \cdot 10^{3}$	$1.63$	$27.0$	$(85 \pm 25)$

* As computed by PySR: the final expressions reported are simplified, ** MSE [h²], *** As defined in Equation (11).

Table 10. Hyperparameters tuned for SR on conductivity.

Hyperparameter	Value
niterations	250
unary_operators	exp, log
binary_operators	+, −, *, /,ˆ
maxsize	15
model_selection	accuracy
parsimony	0.002
constraints	ˆ:(−1, 3)

Table 11. Expressions for conductivity fitted by symbolic regression.

$O$ *	Equation	Loss **	Score ***	${RMSE}_{val}$
8	$\log(0.538 + \frac{S}{Er}) - 45.4$	$0.276$	$2.19 \cdot 10^{-4}$	$0.53 \pm 0.01$
10	$4.63 \cdot 10^{-8} Er + \log(\frac{S}{Er}) - 45.4$	$0.275$	$2.18 \cdot 10^{-3}$	$0.53 \pm 0.01$
12	$4.63 \cdot 10^{-8} Er + \log(\frac{S}{Er} - 0.205) - 45.4$	$0.275$	$4.70 \cdot 10^{-5}$	$0.53 \pm 0.01$
14	$4.63 \cdot 10^{-8} Er + \log(4.63 \cdot 10^{-8} Er + \frac{S}{Er}) - 45.4$	$0.275$	$8.29 \cdot 10^{-5}$	$0.53 \pm 0.01$

* As computed by PySR: the final expressions reported are simplified, ** MSE [pA²], *** As defined in Equation (11).
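
The two simplest expressions in Table 11 can be evaluated directly to see how little the extra terms change the prediction. The sketch below assumes that `log` is the natural logarithm (as for PySR's `log` operator) and that $S$ and $Er$ are the quantities defined earlier in the article. The function names and the input values are hypothetical and serve only to exercise the formulas.

```python
import math


def sr_conductivity_c8(S, Er):
    """Complexity-8 row of Table 11: log(0.538 + S/Er) - 45.4."""
    return math.log(0.538 + S / Er) - 45.4


def sr_conductivity_c10(S, Er):
    """Complexity-10 row of Table 11: 4.63e-8*Er + log(S/Er) - 45.4."""
    return 4.63e-8 * Er + math.log(S / Er) - 45.4


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the simulated dataset.
    S, Er = 1.0e9, 1.0e6
    print(sr_conductivity_c8(S, Er), sr_conductivity_c10(S, Er))
```

When $S/Er$ is much larger than 0.538 and $4.63 \cdot 10^{-8} Er$ is small, both expressions reduce to $\log(S/Er) - 45.4$, which is consistent with the nearly identical losses reported across the rows.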

Table 12. Hyperparameters tuned for SINDy on surface charge density dynamics.

Hyperparameter	Value
threshold	0.005
alpha	0.1
max_iter	20
polynomial_degree	2
include_bias	False
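
As an illustration, the Table 12 settings could be passed to PySINDy as follows. This is a sketch under stated assumptions rather than the authors' implementation: it assumes `threshold`, `alpha`, and `max_iter` belong to the sequentially thresholded least-squares optimizer (`STLSQ`) and that `polynomial_degree` and `include_bias` configure a `PolynomialLibrary`. The names `TABLE_12` and `build_sindy` are hypothetical.

```python
# Table 12 settings; the dictionary and function names are hypothetical.
TABLE_12 = {
    "threshold": 0.005,
    "alpha": 0.1,
    "max_iter": 20,
    "polynomial_degree": 2,
    "include_bias": False,
}


def build_sindy(params=TABLE_12):
    """Create an unfitted SINDy model; PySINDy is imported only when called."""
    import pysindy as ps  # requires the `pysindy` package

    # Assumption: the optimizer is STLSQ, whose arguments match the table.
    optimizer = ps.STLSQ(
        threshold=params["threshold"],
        alpha=params["alpha"],
        max_iter=params["max_iter"],
    )
    # Candidate terms: polynomials up to degree 2 without a constant term.
    library = ps.PolynomialLibrary(
        degree=params["polynomial_degree"],
        include_bias=params["include_bias"],
    )
    return ps.SINDy(optimizer=optimizer, feature_library=library)
```

With `include_bias` set to `False`, the candidate library contains no constant term, so the identified dynamics can only combine linear and quadratic terms of the state variables.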

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
