Machine Learning Applications For Blast Performance Assessment of Cold Formed Steel Girt Systems
(i) Student, Civil Engineering Department, The British University in Egypt, Egypt
(ii) Assistant Professor, Civil Engineering Department, The British University in Egypt,
Egypt
ABSTRACT
Within the past few decades, awareness of the threat posed by improvised explosive devices
has escalated due to the increasing number and magnitude of terrorist attacks. Moreover,
steel cladding has been used extensively in accelerated blast mitigation construction.
In this context, numerous experimental investigations have been conducted to evaluate
the blast performance of cold-formed steel girts, yet no design tools are available.
This shortage is due to the complexity of proposing a simplified tool that accounts for
the multiple input parameters, such as the load and resistance, and their accompanying
uncertainties. The aim of this paper is to derive reliable design tools for cold-formed
steel girts subjected to blast loading by using different machine learning approaches to
devise predictive models. This paper lays the foundation for a novel preliminary blast
design approach that would be applicable to the assessment of different blast mitigation
systems.
INTRODUCTION
Blast loads have been considered in different applications, in both peace and war, for
hundreds of years. However, blast effects and their mitigation techniques and measures
have been extensively investigated and developed within the past few decades
(Krauthammer, 2008). These developments are mainly attributed to the increase in
terrorism risk (number and magnitude of attacks) and the enhanced awareness of its
negative impacts on different aspects of the community (economic, social, environmental,
etc.) (Salem, Campidelli, El-Dakhakhni, & Tait, 2018). Moreover, blast wave parameters
are highly dependent on the explosive source (e.g. chemical attacks, vapors, high
explosives), which constitutes a large source of uncertainty (Krauthammer, 2008). As
such, most of the available literature considers the blast wave parameters rather than
their sources (Krauthammer, 2008).
On the other hand, cold-formed steel (CFS) structures have been in service for many
years and are used for domestic and industrial purposes due to their rapid construction
(Darcy, 2005). Moreover, blast-resistant design using CFS has been the focus of several
studies because CFS members, typically used as cladding, are directly exposed to
free-field blast events (Lane, 2003; Salim & Townsend, 2004; Woodson & DiPaolo).
Steel girt systems, one form of CFS, are considered an effective blast-resistant cladding
system compared to high-performance systems such as precast/prestressed concrete
facades (Aviram et al., 2012; Godinho et al., 2013). Steel girts are secondary
framing members that are placed horizontally to provide lateral support for the
corrugated sheets that resist surface loads, as depicted in Fig. (1).
In this context, this paper introduces different statistical models to assess the blast
performance of steel girt systems. The statistical models are derived using different
machine learning techniques by fitting the experimental results of far-field testing of
different steel girt systems. These models assess the performance of the steel girts
through their deformation (i.e. the current North American response limits; ASCE,
2011; CSA, 2012). The developed models account for different independent variables,
such as blast intensity, material, and geometrical characteristics. Regression models
are selected mainly because they remain valid when only a limited number of data sets
is available. Three multiple regression techniques are presented for far-field blast
testing of steel girt systems: multivariate linear regression (MVLR), polynomial
regression, and random forest regression (RFR). Finally, the reliability of the
presented regression models is assessed using the coefficient of determination
(R2 score) and the root mean square error (RMSE).
METHODOLOGY
MACHINE LEARNING
According to Neter et al. (1996), five assumptions should be made for multivariate
linear regression (MVLR), whose general form is given in Equation (2): existence,
independence, linearity of relationships, homogeneity of variance, and normality.
Existence means that, for each specific combination of the independent variables
X1, X2, X3,.., Xj, the output variable Y is a random variable with a defined mean and
variance in a specific probability distribution. Independence means that the Y values
are independent of one another. Linearity of relationships means that the mean value of
Y is a linear function of X1, X2, X3,.., Xj. Homogeneity of variance means that the
variance of Y is constant for every combination of the independent variables
(X1, X2, X3,.., Xj). Finally, normality means that, for any fixed combination of
X1, X2, X3,.., Xj, Y is normally distributed.
$$ Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_N X_N + \varepsilon $$ Equation (2)
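As an illustration of Equation (2), the coefficients can be estimated by ordinary least
squares. The following is a minimal sketch using NumPy; the data are synthetic and the
coefficient values are assumptions made for the example, not the paper's database or
results.

```python
import numpy as np

# Synthetic, illustrative data (NOT the paper's database): 21 samples, 4 inputs.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(21, 4))
true_beta = np.array([0.5, 1.2, -0.8, 0.3, 0.05])  # assumed beta_0 ... beta_4
y = true_beta[0] + X @ true_beta[1:] + rng.normal(0.0, 0.01, size=21)

# Prepend a column of ones so that beta_0 is estimated as the intercept.
A = np.column_stack([np.ones(len(X)), X])

# Least-squares estimate of beta in Y = beta_0 + beta_1 X_1 + ... + beta_N X_N + eps.
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Fitted values and residuals (the estimate of the error term eps).
y_pred = A @ beta_hat
residuals = y - y_pred
print("estimated coefficients:", np.round(beta_hat, 3))
```

Because the synthetic noise is small, the estimated coefficients should land close to
the assumed ones; with real test data the residuals would also be checked against the
assumptions listed above.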
Random Forest Regression (RFR) is based on the method of decision trees (DT). DT
utilizes a decision-making framework based on information theory, a mathematical
framework for quantifying the information contained in data (Wang & Zang, 2011). DT can
predict both continuous (regression) and categorical (classification) outputs. RFR can
be regarded as a piecewise regression, where the exact regression equation depends on
the features of the data point and on how the trees are structured. A DT is structured
like a real tree: it has a base called the “root node” and terminal “leaf nodes”, each
of which can be thought of as an actual prediction. A DT progresses from one node to
another through “arcs”. Each arc carries the information of all previous nodes,
including the one it originated from, as shown in Fig. (2). A DT works by starting from
the root node and traversing the tree until it reaches a leaf node. This traversal is
done by asking a series of questions about the characteristics of the data point. At
each node, the DT compares a value of the data point with the node value through an
inequality and then proceeds to the following node until it reaches a leaf node (i.e.
the prediction).
Figure 2. Illustration of a Decision Tree.
To reduce the overfitting of a single DT, several trees can be used together to form
several predictions, which is known as the RFR technique. After a prediction is obtained
from each tree, the values are averaged to mitigate the influence of any single faulty
tree.
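The averaging step described above can be checked directly in scikit-learn. The sketch
below fits a `RandomForestRegressor` on synthetic data (an assumption for illustration,
not the paper's database) and confirms that the forest prediction equals the mean of
the predictions of its individual trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic, illustrative data (NOT the paper's database): 21 samples, 4 inputs.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(21, 4))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=21)

# 21 trees, matching the number of trees reported in the text.
forest = RandomForestRegressor(n_estimators=21, random_state=0)
forest.fit(X, y)

# Prediction of each individual decision tree, shape (n_trees, n_samples).
tree_preds = np.array([tree.predict(X) for tree in forest.estimators_])

# Averaging over the trees reproduces the forest prediction.
manual_average = tree_preds.mean(axis=0)
forest_pred = forest.predict(X)
print("max difference:", np.max(np.abs(manual_average - forest_pred)))
```

The spread of `tree_preds` across trees for a given sample gives a rough sense of how
much a single tree can deviate from the averaged prediction.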
MODEL RELIABILITY
To determine the accuracy of the outcome provided by the regressors, a common scale
should be used to compare the different models. In this paper, the coefficient of
determination (i.e. R2 score) and the Root Mean Square Error (RMSE) were used. R2 is
the proportion of the variance of the dependent variable that is explained by the
independent variables and can be calculated using Eq. (3) (Cameron & Windmeijer, 1995).
The RMSE is the standard deviation of the residuals (prediction errors), as illustrated
in Eq. (4) (Chai & Draxler, 2014). The residuals indicate how far the data points are
from the regression line (Hayes, 2019).
$$ R^2 = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}} $$ Equation (3)
$$ RMSE = \sqrt{\frac{\sum_{i=1}^{N} (\text{Predicted}_i - \text{Actual}_i)^2}{N}} $$ Equation (4)
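Both reliability measures can be computed by hand in a few lines. The sketch below uses
the common form of R2, one minus the residual (unexplained) sum of squares divided by
the total sum of squares, and the RMSE of Eq. (4). The four data points are made-up
values chosen so that the arithmetic is easy to follow.

```python
import math

def r2_score_manual(actual, predicted):
    """R2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse_manual(actual, predicted):
    """Square root of the mean squared difference between prediction and actual."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

# Made-up values for illustration only.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

r2 = r2_score_manual(actual, predicted)
rmse = rmse_manual(actual, predicted)
print(round(r2, 3), round(rmse, 3))
```

Here every residual is 0.5 in magnitude, so the residual sum of squares is 1.0, the
total sum of squares is 20.0, R2 is 0.95, and the RMSE is 0.5.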
DATABASE
The blast response of steel girt systems is influenced by multiple parameters such as the
material properties, cladding thickness, girt spacing, boundary conditions, and the
applied blast wave. Typically, the positive peak pressure (Pmax) and positive specific
impulse (I) are the blast wave parameters with the greatest influence on the response of
structural members (Salem, Campidelli, El-Dakhakhni, & Tait, 2019; Shin, Whittaker, &
Cormie, 2015). The positive peak pressure is the sudden rise above the ambient pressure
(Po) due to detonation, while the specific impulse is the integral of the pressure over
the loading duration (td), as shown in Fig. (3). As such, including all the
aforementioned parameters in training the regression models may yield a complicated
model that is sensitive to each input variable.
[Figure 3. Blast pressure history: pressure (MPa) axis annotated with PMax, P0, and td.]
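The definition of the specific impulse as the integral of the pressure over the loading
duration can be sketched numerically. The example below assumes an idealized triangular
pulse that decays linearly from the peak overpressure to zero at td; this pulse shape
and the numerical values are assumptions for illustration and are not taken from the
test database.

```python
def specific_impulse(times_ms, pressures_kpa):
    """Trapezoidal integration of pressure over time, returned in kPa-ms."""
    total = 0.0
    for i in range(1, len(times_ms)):
        dt = times_ms[i] - times_ms[i - 1]
        total += 0.5 * (pressures_kpa[i] + pressures_kpa[i - 1]) * dt
    return total

# Hypothetical values for illustration only.
p_max = 50.0  # peak overpressure above ambient, kPa
t_d = 10.0    # positive-phase duration, ms

# Assumed triangular decay from p_max at t = 0 to zero at t = t_d.
n_points = 101
times = [t_d * i / (n_points - 1) for i in range(n_points)]
pressures = [p_max * (1.0 - t / t_d) for t in times]

impulse = specific_impulse(times, pressures)
print(round(impulse, 3))
```

For this triangular pulse the result matches the closed form 0.5 × Pmax × td, which is
250 kPa-ms for the assumed values. A recorded pressure history could be passed to the
same function in place of the idealized one.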
Yet all the input parameters still influence the expected outcomes. Consequently,
the influencing parameters are lumped into two groups, namely the resistance and loading
groups. The resistance group represents the resistance function of the steel girt
system, which is derived from input parameters such as the material properties, cladding
thicknesses, girt spacing, and boundary conditions. It is expressed in the form of the
resistance function developed by the Unified Facilities Criteria (UFC, 2008). The UFC
proposed an idealized bilinear function for steel systems defined by the initial
stiffness “K” and the ultimate capacity “Ru”, as shown in Fig. (4).
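A bilinear resistance function of this kind can be written as a short function. The
sketch below assumes the usual idealization in which the resistance rises linearly with
slope K until it reaches Ru and then stays constant; the function name and the numerical
values are hypothetical and are not taken from the UFC or from the test database.

```python
def bilinear_resistance(displacement, k, r_u):
    """Idealized resistance: K * x on the elastic branch, capped at Ru afterwards."""
    if displacement < 0:
        raise ValueError("displacement must be non-negative in this sketch")
    return min(k * displacement, r_u)

# Hypothetical values for illustration only.
k = 200.0   # initial stiffness, kPa/m
r_u = 10.0  # ultimate capacity, kPa

# Displacement at which the elastic branch meets the plateau, in m.
x_yield = r_u / k

for x in (0.01, 0.05, 0.10):
    print(x, bilinear_resistance(x, k, r_u))
```

With these values the plateau starts at a displacement of 0.05 m, so the first point
lies on the elastic branch and the last one on the plateau.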
[Table of database parameters; column headers: P, kPa (psi); I, kPa-ms (psi-ms); Ru, kPa (psi); K, kPa/m (psi/in); θ.]
To check the suitability of the data used, mutual variable correlation diagrams are
plotted to visualize the distribution of the parameters (input and output), as shown in
Fig. (5). Fig. (5) presents the relationships between the variables and their
frequencies. For example, the bar chart in the top left corner represents the frequency
of the peak pressure within the database, while the rest of the row represents the
variability of the considered I, Ru, K, and θ.
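A numerical companion to such diagrams is a histogram per variable together with a
correlation matrix. The following is a minimal sketch with NumPy; the five columns are
named after the variables in the text, but the values are synthetic and the relations
between them are assumptions made only so that the example produces non-trivial
correlations.

```python
import numpy as np

# Synthetic, illustrative data (NOT the paper's database): 21 samples.
rng = np.random.default_rng(2)
names = ["P", "I", "Ru", "K", "theta"]
P = rng.uniform(10.0, 100.0, 21)
I = 5.0 * P + rng.normal(0.0, 20.0, 21)  # assumed relation, illustration only
Ru = rng.uniform(5.0, 20.0, 21)
K = rng.uniform(100.0, 500.0, 21)
theta = 0.05 * I / Ru + rng.normal(0.0, 0.5, 21)  # assumed relation
data = np.column_stack([P, I, Ru, K, theta])

# Frequency of the first variable, i.e. what a diagonal bar chart would display.
counts, bin_edges = np.histogram(data[:, 0], bins=5)

# Pearson correlation matrix; rowvar=False treats each column as one variable.
corr = np.corrcoef(data, rowvar=False)

print("P histogram counts:", counts)
for name, row in zip(names, np.round(corr, 2)):
    print(f"{name:>5}", row)
```

Off-diagonal entries close to ±1 would flag strongly related inputs, while the histogram
counts show how evenly each parameter is covered by the available tests.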
SENSITIVITY
After the data were extracted, regression models using RFR, MVLR, and polynomial
regression were created with the scikit-learn library (Pedregosa et al., 2011). The
developed models, especially the RFR and polynomial models, have several parameters
that control the accuracy of the model. For example, the accuracy of the RFR is related
to the number of trees used to predict the outcome. It is worth noting that there is no
limit on the number of trees that can be used to create an RFR model. Therefore, a
sensitivity analysis was performed to optimize the number of trees, with the R2 values
used as an indication of model stability. Fig. (6) shows the model sensitivity to the
number of trees and demonstrates that the optimum number of trees is around 20 to 25.
This choice is based on the fact that R2 did not improve significantly when more than
25 DTs were used, whereas increasing the number of DTs beyond this limit would increase
the processing time. Conversely, using fewer than 20 trees may yield an unreliable
model. As such, the RFR model used in this study is based on 21 trees, the same as the
sample size.
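A sensitivity sweep of this kind can be sketched with scikit-learn as follows. The data
are synthetic, and scoring R2 on the fitting data is an assumption of this sketch; the
paper does not state which data the reported scores were computed on.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Synthetic, illustrative data (NOT the paper's database): 21 samples, 4 inputs.
rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=(21, 4))
y = 5.0 * X[:, 0] / (0.5 + X[:, 2]) + rng.normal(0.0, 0.1, size=21)

tree_counts = list(range(5, 45, 5))  # 5, 10, ..., 40 trees
scores = {}
for n_trees in tree_counts:
    model = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    model.fit(X, y)
    # R2 on the fitting data, used here only as a stability indicator.
    scores[n_trees] = r2_score(y, model.predict(X))

for n_trees, score in scores.items():
    print(f"{n_trees:2d} trees: R2 = {score:.3f}")
```

Plotting `scores` against `tree_counts` gives a curve of the same type as the one
described for Fig. (6); the point where the curve flattens suggests a reasonable number
of trees.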
[Figure 6. R score (vertical axis, 0.84 to 0.98) versus number of trees (horizontal axis, 5 to 40).]
Similarly, polynomial regression provided better results as the degree increased;
however, it is prone to overfitting. The model was tested at different degrees against
R2, as shown in Fig. (7). In Fig. (7), it is evident that the model started to overfit
once it reached the 3rd degree (R2 = 1.0). It is also evident that the 1st-degree
equation gives the same value as the MVLR, which verifies the model. Therefore, a
2nd-degree equation is used to limit overfitting.
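The degree sweep can be sketched with a scikit-learn pipeline. The example assumes four
input variables and 21 samples, as the text suggests, and uses synthetic data. Under
that assumption a 3rd-degree expansion has 35 terms, more than the number of samples,
so a least-squares fit can reproduce the fitting data exactly, which is consistent with
an R2 of 1.0 being a sign of overfitting rather than of accuracy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, illustrative data (NOT the paper's database): 21 samples, 4 inputs.
rng = np.random.default_rng(4)
X = rng.uniform(0.0, 1.0, size=(21, 4))
y = 2.0 * X[:, 0] ** 2 + X[:, 1] * X[:, 2] + rng.normal(0.0, 0.1, size=21)

n_terms = {}
train_r2 = {}
for degree in (1, 2, 3, 4):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X, y)
    # Number of polynomial terms generated for this degree (including the constant).
    n_terms[degree] = model.named_steps["polynomialfeatures"].n_output_features_
    # R2 on the fitting data.
    train_r2[degree] = r2_score(y, model.predict(X))

# Plain multivariate linear regression for comparison with degree 1.
mvlr_r2 = r2_score(y, LinearRegression().fit(X, y).predict(X))

for degree in train_r2:
    print(degree, n_terms[degree], round(train_r2[degree], 4))
print("MVLR:", round(mvlr_r2, 4))
```

The degree-1 score coincides with the plain linear regression, mirroring the
verification mentioned in the text. Scoring on held-out samples, which is not done in
this sketch, would be one way to check whether a 2nd-degree fit already overfits.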
[Figure 7. R score (vertical axis, 0 to 1.2) versus degree of equation (horizontal axis, 1 to 4).]
For each regression model, the R score and RMSE were calculated to assess its
reliability. The results are summarized in Table 2. From Table 2, it is apparent that
MVLR has the worst values. MVLR yields the lowest R score (R2 = 0.53), which indicates
that the data do not follow a linear trend (Ang & Tang, 2007). This low value arises
because MVLR is a first-degree equation, whereas the data follow a different trend.
Both RFR and the 2nd-degree polynomial regression provided satisfactory R scores and
RMSE values. However, the RFR model is considered more reliable because it is less
affected by overfitting than the 2nd-degree polynomial regression. Moreover, the
reliability of the models was tested against the experimental results for the 21 data
sets, as shown in Fig. (8).
Table 2. R score (R2), and RMSE values for the proposed models.
[Figure 8. Model predictions compared with experimental results; horizontal axis: sample number (1 to 21).]
CONCLUSIONS:
This paper developed regression models for the blast assessment of steel girt systems
using machine learning techniques, namely MVLR, polynomial regression, and RFR. The
developed models were based on a database of results from real explosive testing. The
reliability of the models was assessed using R2 and RMSE. The reliability assessment
showed that the RFR was the most reliable model, with the highest accuracy and no
overfitting. Meanwhile, the polynomial regression offered a high level of accuracy;
however, it starts to overfit at the 3rd degree. The 2nd-degree polynomial regression
may be accurate in terms of R2, but there is no evidence that it has not already
started overfitting.
The presented models are computationally inexpensive and can be used in further
probabilistic blast performance investigations of steel girt systems. Although the
regression models can work with limited data, which is a main advantage of using
machine learning, expanding the test database may yield more reliable models.
Additionally, other predictive models and reliability assessment indices may yield
reliable models as well.
REFERENCES: